(19) 






Europäisches Patentamt
European Patent Office
Office européen des brevets



(11) 



EP 1 108 790 A2



(12) 



EUROPEAN PATENT APPLICATION 



(43) Date of publication: 

20.06.2001 Bulletin 2001/25 

(21) Application number: 00127688.0 

(22) Date of filing: 18.12.2000



(51) Int. Cl.7: C12Q 1/68, C07H 21/04,
C12N 15/63, C07K 14/34,
C12R 1/15, G06F 17/00,
C12R 1/13, G01N 33/50



(84) Designated Contracting States: 

AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU
MC NL PT SE TR
Designated Extension States:
AL LT LV MK RO SI

(30) Priority: 16.12.1999 JP 37748499 

07.04.2000 JP 2000159162 
03.08.2000 JP 2000280988 

(83) Declaration under Rule 28(4) EPC (expert 
solution) 

(71) Applicant: KYOWA HAKKO KOGYO CO., LTD. 
Chiyoda-ku, Tokyo 100-8185 (JP) 

(72) Inventors: 

• Nakagawa, Satoshi,

c/o Kyowa Hakko Kogyo Co., Ltd. 
Machida-shi, Tokyo 194-8533 (JP) 

• Mizoguchi, Hiroshi, 

c/o Kyowa Hakko Kogyo Co.,Ltd. 
Machida-shi, Tokyo 194-8533 (JP) 



• Ando, Seiko, c/o Kyowa Hakko Kogyo Co., Ltd. 
Machida-shi, Tokyo 194-8533 (JP) 

• Hayashi, Mikiro, 

c/o Kyowa Hakko Kogyo Co., Ltd. 
Machida-shi, Tokyo 194-8533 (JP) 

• Ochiai, Keiko, c/o Kyowa Hakko Kogyo Co.,Ltd. 
Machida-shi, Tokyo 194-8533 (JP) 

• Yokoi, Haruhiko, 

c/o Kyowa Hakko Kogyo Co., Ltd. 
Machida-shi, Tokyo 194-8533 (JP) 

• Tateishi, Naoko, 

c/o Kyowa Hakko Kogyo Co., Ltd. 
Machida-shi, Tokyo 194-8533 (JP) 

• Senoh, Akihiro, c/o Kyowa Hakko Kogyo Co.,Ltd. 
Machida-shi, Tokyo 194-8533 (JP) 

• Ikeda, Masato, c/o Kyowa Hakko Kogyo Co.,Ltd. 
Machida-shi, Tokyo 194-8533 (JP)

• Ozaki, Akio, c/o Kyowa Hakko Kogyo Co., Ltd. 
Hofu-shi, Yamaguchi 747-8522 (JP) 

(74) Representative: VOSSIUS & PARTNER 
Siebertstrasse 4
81675 München (DE)



(54) Novel polynucleotides 

(57) Novel polynucleotides derived from microorganisms belonging to coryneform bacteria and
fragments thereof, polypeptides encoded by the polynucleotides and fragments thereof,
polynucleotide arrays comprising the polynucleotides and fragments thereof, computer-readable
recording media in which the nucleotide sequences of the polynucleotides and fragments thereof
have been recorded, and uses thereof.




Printed by Jouve. 75001 PARIS (FR) 






Description

BACKGROUND OF THE INVENTION 
1. Field of the Invention

[0001] The present invention relates to novel polynucleotides derived from microorganisms belonging to coryneform
bacteria and fragments thereof, polypeptides encoded by the polynucleotides and fragments thereof, polynucleotide
arrays comprising the polynucleotides and fragments thereof, computer readable recording media in which the
nucleotide sequences of the polynucleotides and fragments thereof have been recorded, and use of them, as well as a
method of using the polynucleotide and/or polypeptide sequence information to make comparisons.

2. Brief Description of the Background Art 

[0002] Coryneform bacteria are used in producing various useful substances, such as amino acids, nucleic acids,
vitamins, saccharides (for example, ribulose), organic acids (for example, pyruvic acid), and analogues of the
above-described substances (for example, N-acetylamino acids), and are industrially very useful microorganisms.
Many mutants thereof are known.

[0003] For example, Corynebacterium glutamicum is a Gram-positive bacterium identified as a glutamic acid-producing
bacterium, and many amino acids are produced by mutants thereof. For example, 1,000,000 tons/year of L-glutamic
acid, which is useful as a seasoning for umami (delicious taste), 250,000 tons/year of L-lysine, which is a valuable
additive for livestock feeds and the like, and several hundred tons/year or more of other amino acids, such as
L-arginine, L-proline, L-glutamine, L-tryptophan, and the like, have been produced in the world (Nikkei Bio
Yearbook 99, published by Nikkei BP (1998)).

[0004] The production of amino acids by Corynebacterium glutamicum is mainly carried out by its mutants (metabolic
mutants), which have mutated metabolic pathways and regulatory systems. In general, an organism is provided with
various metabolic regulatory systems so as not to produce more amino acids than it needs. In the biosynthesis of
L-lysine, for example, a microorganism belonging to the genus Corynebacterium is regulated so that excessive
production is prevented through concerted inhibition by lysine and threonine of the activity of aspartokinase, a
biosynthesis enzyme common to lysine, threonine and methionine (J. Biochem., 65: 849-859 (1969)). The biosynthesis
of arginine is controlled by repression of the expression of its biosynthesis genes by arginine, so that an excessive
amount of arginine is not biosynthesized (Microbiology, 142: 99-108 (1996)). It is considered that these metabolic
regulatory mechanisms are deregulated in amino acid-producing mutants. Similarly, the metabolic regulation is
deregulated in mutants producing nucleic acids, vitamins, saccharides, organic acids and analogues of the
above-described substances so as to improve the productivity of the objective product.

[0005] However, accumulation of basic genetic, biochemical and molecular biological data on coryneform bacteria
is insufficient in comparison with Escherichia coli, Bacillus subtilis, and the like. Also, few findings have been obtained
on mutated genes in amino acid-producing mutants. Thus, there are various mechanisms, which are still unknown, of
regulating the growth and metabolism of these microorganisms.

[0006] A chromosomal physical map of Corynebacterium glutamicum ATCC 13032 has been reported, and it is known that
its genome size is about 3,100 kb (Mol. Gen. Genet., 252: 255-265 (1996)). Calculating on the basis of the usual gene
density of bacteria, it is presumed that about 3,000 genes are present in this genome of about 3,100 kb. However, only
about 100 genes, mainly concerning amino acid biosynthesis, are known in Corynebacterium glutamicum, and
the nucleotide sequences of most genes have not been clarified hitherto.
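
The estimate in paragraph [0006] is simple arithmetic and can be reproduced as a minimal Python sketch. The gene density of roughly one gene per kilobase is an assumption made here for illustration only (the text says just "the usual gene density of bacteria"), and the names below are hypothetical.

```python
# Figures taken from paragraph [0006].
GENOME_SIZE_KB = 3100   # approximate genome size of ATCC 13032
KNOWN_GENES = 100       # approximate number of genes known at the time

# Assumption (not stated in the text): about one gene per kilobase.
ASSUMED_GENES_PER_KB = 1.0


def estimate_gene_count(genome_size_kb, genes_per_kb):
    """Estimate the number of genes from genome size and gene density."""
    return round(genome_size_kb * genes_per_kb)


estimated_genes = estimate_gene_count(GENOME_SIZE_KB, ASSUMED_GENES_PER_KB)
known_fraction = KNOWN_GENES / estimated_genes

print(f"Estimated genes: about {estimated_genes}")
print(f"Fraction already known: {known_fraction:.1%}")
```

Under this assumed density the estimate is of the same order as the "about 3,000 genes" given in the text, and the roughly 100 known genes correspond to only a few percent of the presumed total.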

[0007] In recent years, the full nucleotide sequences of the genomes of several microorganisms, such as Escherichia
coli, Mycobacterium tuberculosis, yeast, and the like, have been determined (Science, 277: 1453-62 (1997); Nature,
393: 537-544 (1998); Nature, 387: 5-105 (1997)). Based on the thus determined full nucleotide sequences, assumption
of gene regions and prediction of their function by comparison with the nucleotide sequences of known genes have
been carried out. Thus, the functions of a great number of genes have been presumed, without genetic, biochemical
or molecular biological experiments.

[0008] In recent years, moreover, techniques for monitoring expression levels of a great number of genes
simultaneously or detecting mutations, using DNA chips, DNA arrays or the like in which a partial nucleic acid fragment
of a gene or a partial nucleic acid fragment in genomic DNA other than a gene is fixed to a solid support, have been
developed. The techniques contribute to the analysis of microorganisms, such as yeasts, Mycobacterium tuberculosis,
Mycobacterium bovis used in BCG vaccines, and the like (Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA,
96: 12833-38 (1999); Science, 284: 1520-23 (1999)).

SUMMARY OF THE INVENTION

[0009] An object of the present invention is to provide a polynucleotide and a polypeptide derived from a
microorganism of coryneform bacteria which are industrially useful, sequence information of the polynucleotide and
the polypeptide, a method for analyzing the microorganism, an apparatus and a system for use in the analysis, and a
method for breeding the microorganism.

[0010] The present invention provides a polynucleotide and an oligonucleotide derived from a microorganism
belonging to coryneform bacteria, oligonucleotide arrays to which the polynucleotides and the oligonucleotides are fixed,
a polypeptide encoded by the polynucleotide, an antibody which recognizes the polypeptide, polypeptide arrays to
which the polypeptides or the antibodies are fixed, a computer readable recording medium in which the nucleotide
sequences of the polynucleotide and the oligonucleotide and the amino acid sequence of the polypeptide have been
recorded, and a system based on the computer using the recording medium, as well as a method of using the
polynucleotide and/or polypeptide sequence information to make comparisons.

BRIEF DESCRIPTION OF THE DRAWING

[0011] Fig. 1 is a map showing the positions of typical genes on the genome of Corynebacterium glutamicum ATCC
13032.

[0012] Fig. 2 is an electrophoresis image showing the results of proteome analyses using proteins derived from
(A) Corynebacterium glutamicum ATCC 13032, (B) FERM BP-7134, and (C) FERM BP-158.

[0013] Fig. 3 is a flow chart of an example of a system using the computer readable media according to the present 
invention. 

[0014] Fig. 4 is a flow chart of an example of a system using the computer readable media according to the present 
invention. 


DETAILED DESCRIPTION OF THE INVENTION 

[0015] This application is based on Japanese applications No. Hei. 11-377484 filed on December 16, 1999, No.
2000-159162 filed on April 7, 2000 and No. 2000-280988 filed on August 3, 2000, the entire contents of which are
incorporated hereinto by reference.

[0016] From the viewpoint that the determination of the full nucleotide sequence of Corynebacterium glutamicum
would make it possible to specify gene regions which had not been previously identified, to determine the function of
an unknown gene derived from the microorganism through comparison with nucleotide sequences of known genes
and amino acid sequences of known genes, and to obtain a useful mutant based on the presumption of the metabolic
regulatory mechanism of a useful product by the microorganism, the inventors conducted intensive studies and, as a
result, found that the complete genome sequence of Corynebacterium glutamicum can be determined by applying the
whole genome shotgun method.

[0017] Specifically, the present invention relates to the following (1) to (65):

(1) A method for at least one of the following:

(A) identifying a mutation point of a gene derived from a mutant of a coryneform bacterium,

(B) measuring an expression amount of a gene derived from a coryneform bacterium,

(C) analyzing an expression profile of a gene derived from a coryneform bacterium,

(D) analyzing expression patterns of genes derived from a coryneform bacterium, or

(E) identifying a gene homologous to a gene derived from a coryneform bacterium,

said method comprising:

(a) producing a polynucleotide array by adhering to a solid support at least two polynucleotides selected
from the group consisting of first polynucleotides comprising the nucleotide sequence represented by any
one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize with the first polynucleotides under
stringent conditions, and third polynucleotides comprising a sequence of 10 to 200 continuous bases of
the first or second polynucleotides,

(b) incubating the polynucleotide array with at least one of a labeled polynucleotide derived from a
coryneform bacterium, a labeled polynucleotide derived from a mutant of the coryneform bacterium or a
labeled polynucleotide to be examined, under hybridization conditions,

(c) detecting any hybridization, and

(d) analyzing the result of the hybridization.

As used herein, for example, the at least two polynucleotides can be at least two of the first
polynucleotides, at least two of the second polynucleotides, at least two of the third polynucleotides, or at least
two of the first, second and third polynucleotides.

(2) The method according to (1), wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(3) The method according to (2), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(4) The method according to (1), wherein the polynucleotide derived from a coryneform bacterium, the
polynucleotide derived from a mutant of the coryneform bacterium or the polynucleotide to be examined is a gene relating
to the biosynthesis of at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide,
an organic acid, and analogues thereof.

(5) The method according to (1), wherein the polynucleotide to be examined is derived from Escherichia coli.

(6) A polynucleotide array, comprising:

at least two polynucleotides selected from the group consisting of first polynucleotides comprising the
nucleotide sequence represented by any one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize
with the first polynucleotides under stringent conditions, and third polynucleotides comprising 10 to 200
continuous bases of the first or second polynucleotides, and
a solid support adhered thereto.

As used herein, for example, the at least two polynucleotides can be at least two of the first polynucleotides,
at least two of the second polynucleotides, at least two of the third polynucleotides, or at least two of the first,
second and third polynucleotides.

(7) A polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1 or a polynucleotide having 
a homology of at least 80% with the polynucleotide. 

(8) A polynucleotide comprising any one of the nucleotide sequences represented by SEQ ID NOS:2 to 3431, or
a polynucleotide which hybridizes with the polynucleotide under stringent conditions.

(9) A polynucleotide encoding a polypeptide having any one of the amino acid sequences represented by SEQ ID
NOS:3502 to 6931, or a polynucleotide which hybridizes therewith under stringent conditions.

(10) A polynucleotide which is present in the 5' upstream or 3' downstream of a polynucleotide comprising the
nucleotide sequence of any one of SEQ ID NOS:2 to 3431 in a whole polynucleotide comprising the nucleotide
sequence represented by SEQ ID NO:1, and has an activity of regulating an expression of the polynucleotide.

(11) A polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequence of the polynucleotide of
any one of (7) to (10), or a polynucleotide comprising a nucleotide sequence complementary to the polynucleotide
comprising 10 to 200 continuous bases.
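
The fragment defined in (11) can be illustrated with a minimal Python sketch that extracts a run of 10 to 200 continuous bases and derives the complementary strand. The function names and the toy sequence are hypothetical and are not taken from the sequence listing; the complementary sequence is written 5' to 3' (as a reverse complement) by assumption.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def continuous_fragment(sequence, start, length):
    """Return `length` continuous bases of `sequence` beginning at `start` (0-based)."""
    if not 10 <= length <= 200:
        raise ValueError("fragment length must be between 10 and 200 bases")
    if start < 0 or start + length > len(sequence):
        raise ValueError("fragment lies outside the sequence")
    return sequence[start:start + length]


def reverse_complement(fragment):
    """Return the complementary strand, written 5' to 3'."""
    return fragment.translate(COMPLEMENT)[::-1]


toy_sequence = "ATGCGTACGTTAGC"  # hypothetical 14-base example
fragment = continuous_fragment(toy_sequence, 2, 10)
print(fragment, reverse_complement(fragment))
```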

(12) A recombinant DNA comprising the polynucleotide of any one of (8) to (11).

(13) A transformant comprising the polynucleotide of any one of (8) to (11) or the recombinant DNA of (12).

(14) A method for producing a polypeptide, comprising:

culturing the transformant of (13) in a medium to produce and accumulate a polypeptide encoded by the
polynucleotide of (8) or (9) in the medium, and
recovering the polypeptide from the medium.

(15) A method for producing at least one of an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid,
and analogues thereof, comprising:

culturing the transformant of (13) in a medium to produce and accumulate at least one of an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof in the medium, and
recovering the at least one of the amino acid, the nucleic acid, the vitamin, the saccharide, the organic acid,
and analogues thereof from the medium.

(16) A polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from SEQ ID NOS:2
to 3431.

(17) A polypeptide comprising the amino acid sequence selected from SEQ ID NOS:3502 to 6931.

(18) The polypeptide according to (16) or (17), wherein at least one amino acid is deleted, replaced, inserted or
added, said polypeptide having an activity which is substantially the same as that of the polypeptide without said
at least one amino acid deletion, replacement, insertion or addition.

(19) A polypeptide comprising an amino acid sequence having a homology of at least 60% with the amino acid
sequence of the polypeptide of (16) or (17), and having an activity which is substantially the same as that of the
polypeptide.

(20) An antibody which recognizes the polypeptide of any one of (16) to (19). 

(21) A polypeptide array, comprising:

at least one polypeptide or partial fragment polypeptide selected from the polypeptides of (16) to (19) and
partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto.

(22) A polypeptide array, comprising:

at least one antibody which recognizes a polypeptide or partial fragment polypeptide selected from the
polypeptides of (16) to (19) and partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto.

(23) A system based on a computer for identifying a target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:1
to 3501, and target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS: 
1 to 3501 with the target sequence or target structure motif information, recorded by the data storage device 
for screening and analyzing nucleotide sequence information which is coincident with or analogous to the 
target sequence or target structure motif information; and 

(iv) an output device that shows a screening or analyzing result obtained by the comparator. 

(24) A method based on a computer for identifying a target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501, target
sequence information or target structure motif information into a user input device;

(ii) at least temporarily storing said information; 

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501 with 
the target sequence or target structure motif information; and 

(iv) screening and analyzing nucleotide sequence information which is coincident with or analogous to the 
target sequence or target structure motif information. 
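
The comparing and screening steps of (23) and (24) can be pictured with a minimal Python sketch. It is only an illustration under stated assumptions: the stored sequences, the names `best_window_identity` and `screen`, the ungapped sliding-window comparison, and the 80% identity threshold used to call a sequence "analogous" are all hypothetical choices made here, not details given in the text.

```python
def best_window_identity(stored, target):
    """Slide `target` along `stored` without gaps; return (best identity, 0-based offset)."""
    if not target or len(target) > len(stored):
        return 0.0, -1
    best_identity, best_offset = 0.0, -1
    for offset in range(len(stored) - len(target) + 1):
        window = stored[offset:offset + len(target)]
        matches = sum(a == b for a, b in zip(window, target))
        identity = matches / len(target)
        if identity > best_identity:  # keep the first window with the highest identity
            best_identity, best_offset = identity, offset
    return best_identity, best_offset


def screen(stored_sequences, target, threshold=0.8):
    """List stored sequences that are coincident with or analogous to the target."""
    hits = []
    for seq_id, sequence in stored_sequences.items():
        identity, offset = best_window_identity(sequence, target)
        if identity >= threshold:
            label = "coincident" if identity == 1.0 else "analogous"
            hits.append((seq_id, offset, identity, label))
    # Report the closest matches first.
    return sorted(hits, key=lambda hit: -hit[2])


# Hypothetical stored sequence information and target sequence.
stored_sequences = {
    "seq_a": "ATGGCCAAGTTGACC",
    "seq_b": "ATGGCTAAGTTGACC",
    "seq_c": "TTTTTTTTTTTTTTT",
}
hits = screen(stored_sequences, "GCCAAGTTGA")
for hit in hits:
    print(hit)
```

A practical system would more likely rely on a gapped alignment or a dedicated homology search tool; the ungapped window is used here only to keep the sketch short.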

(25) A system based on a computer for identifying a target sequence or a target structure motif derived from a 
coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001, and target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001 with the target sequence or target structure motif information, recorded by the data storage 
device for screening and analyzing amino acid sequence information which is coincident with or analogous to 
the target sequence or target structure motif information; and 

(iv) an output device that shows a screening or analyzing result obtained by the comparator. 

(26) A method based on a computer for identifying a target sequence or a target structure motif derived from a
coryneform bacterium, comprising the following:

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, and target
sequence information or target structure motif information into a user input device;

(ii) at least temporarily storing said information;

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001 
with the target sequence or target structure motif information; and 

(iv) screening and analyzing amino acid sequence information which is coincident with or analogous to the 
target sequence or target structure motif information. 

(27) A system based on a computer for determining a function of a polypeptide encoded by a polynucleotide having 
a target nucleotide sequence derived from a coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:2
to 3501, function information of a polypeptide encoded by the nucleotide sequence, and target nucleotide
sequence information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:
2 to 3501 with the target nucleotide sequence information, and determines a function of a polypeptide encoded
by a polynucleotide having the target nucleotide sequence which is coincident with or analogous to the
polynucleotide having at least one nucleotide sequence selected from SEQ ID NOS:2 to 3501; and

(iv) an output device that shows a function obtained by the comparator.

(28) A method based on a computer for determining a function of a polypeptide encoded by a polynucleotide
having a target nucleotide sequence derived from a coryneform bacterium, comprising the following:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501, function
information of a polypeptide encoded by the nucleotide sequence, and target nucleotide sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501 with 
the target nucleotide sequence information; and 

(iv) determining a function of a polypeptide encoded by a polynucleotide having the target nucleotide sequence 
which is coincident with or analogous to the polynucleotide having at least one nucleotide sequence selected 
from SEQ ID NOS:2 to 3501. 

(29) A system based on a computer for determining a function of a polypeptide having a target amino acid sequence 
derived from a coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001, function information based on the amino acid sequence, and target amino acid sequence
information;

(ii) a data storing device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001 with the target amino acid sequence information for determining a function of a polypeptide 
having the target amino acid sequence which is coincident with or analogous to the polypeptide having at least 
one amino acid sequence selected from SEQ ID NOS:3502 to 7001; and 

(iv) an output device that shows a function obtained by the comparator. 

(30) A method based on a computer for determining a function of a polypeptide having a target amino acid sequence 
derived from a coryneform bacterium, comprising the following: 

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, function
information based on the amino acid sequence, and target amino acid sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001 
with the target amino acid sequence information; and 

(iv) determining a function of a polypeptide having the target amino acid sequence which is coincident with or 
analogous to the polypeptide having at least one amino acid sequence selected from SEQ ID NOS:3502 to 
7001. 

(31) The system according to any one of (23), (25), (27) and (29), wherein a coryneform bacterium is a
microorganism of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(32) The method according to any one of (24), (26), (28) and (30), wherein a coryneform bacterium is a microor- 
ganism of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium. 

(33) The system according to (31), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(34) The method according to (32), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(35) A recording medium or storage device which is readable by a computer in which at least one nucleotide 
sequence information selected from SEQ ID NOS:1 to 3501 or function information based on the nucleotide se- 
quence is recorded, and is usable in the system of (23) or (27) or the method of (24) or (28). 

(36) A recording medium or storage device which is readable by a computer in which at least one amino acid
sequence information selected from SEQ ID NOS:3502 to 7001 or function information based on the amino acid
sequence is recorded, and is usable in the system of (25) or (29) or the method of (26) or (30).

(37) The recording medium or storage device according to (35) or (36), which is a computer readable recording
medium selected from the group consisting of a floppy disc, a hard disc, a magnetic tape, a random access memory
(RAM), a read only memory (ROM), a magneto-optic disc (MO), CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM and DVD-RW.

(38) A polypeptide having a homoserine dehydrogenase activity, comprising an amino acid sequence in which the
Val residue at the 59th position in the amino acid sequence of homoserine dehydrogenase derived from a coryneform
bacterium is replaced with an amino acid residue other than a Val residue.

(39) A polypeptide comprising an amino acid sequence in which the Val residue at the 59th position in the amino
acid sequence represented by SEQ ID NO:6952 is replaced with an amino acid residue other than a Val residue.

(40) The polypeptide according to (38) or (39), wherein the Val residue at the 59th position is replaced with an Ala 
residue. 

(41) A polypeptide having pyruvate carboxylase activity, comprising an amino acid sequence in which the Pro
residue at the 458th position in the amino acid sequence of pyruvate carboxylase derived from a coryneform
bacterium is replaced with an amino acid residue other than a Pro residue.

(42) A polypeptide comprising an amino acid sequence in which the Pro residue at the 458th position in the amino 
acid sequence represented by SEQ ID NO:4265 is replaced with an amino acid residue other than a Pro residue. 

(43) The polypeptide according to (41) or (42), wherein the Pro residue at the 458th position is replaced with a Ser
residue.

(44) The polypeptide according to any one of (38) to (43), which is derived from Corynebacterium glutamicum. 

(45) A DNA encoding the polypeptide of any one of (38) to (44). 

(46) A recombinant DNA comprising the DNA of (45). 

(47) A transformant comprising the recombinant DNA of (46). 

(48) A transformant comprising in its chromosome the DNA of (45).

(49) The transformant according to (47) or (48), which is derived from a coryneform bacterium. 

(50) The transformant according to (49), which is derived from Corynebacterium glutamicum. 

(51) A method for producing L-lysine, comprising:

culturing the transformant of any one of (47) to (50) in a medium to produce and accumulate L-lysine in the
medium, and

recovering the L-lysine from the culture.

(52) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:1 to 3431, comprising the following:

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform
bacterium which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i); 

(iii) introducing the mutation point into a coryneform bacterium which is free of the mutation point; and 

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii).

(53) The method according to (52), wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or 
a signal transmission pathway. 

(54) The method according to (52), wherein the mutation point is a mutation point relating to a useful mutation
which improves or stabilizes the productivity.
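
Steps (i) and (ii) of the method of (52) amount to comparing a production-strain sequence with the corresponding reference sequence and listing the positions where they differ. A minimal Python sketch follows, assuming the two sequences are already aligned and of equal length, so that only base substitutions are detected. The function name `find_point_mutations` and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def find_point_mutations(reference, production):
    """Return (1-based position, reference base, production base) for each substitution."""
    if len(reference) != len(production):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, prod_base)
        for position, (ref_base, prod_base) in enumerate(zip(reference, production), start=1)
        if ref_base != prod_base
    ]


# Hypothetical toy example: the second codon changes from GTT (Val) to GCT (Ala).
reference_gene = "ATGGTTCCA"
production_gene = "ATGGCTCCA"

mutation_points = find_point_mutations(reference_gene, production_gene)
print(mutation_points)
```

Insertions and deletions would require a gapped alignment before this comparison; each mutation point found this way is then a candidate for introduction into a strain free of the mutation, as in step (iii).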

(55) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:1 to 3431, comprising:

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform
bacterium which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i);

(iii) deleting a mutation point from a coryneform bacterium having the mutation point; and

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii).

(56) The method according to (55), wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or
a signal transmission pathway.

(57) The method according to (55), wherein the mutation point is a mutation point which decreases or destabilizes 
the productivity. 

(58) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:2 to 3431, comprising the following:

(i) identifying an isozyme relating to biosynthesis of at least one compound selected from an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, based on the nucleotide
sequence information represented by SEQ ID NOS:2 to 3431;

(ii) classifying the isozyme identified in (i) into an isozyme having the same activity;

(iii) mutating all genes encoding the isozyme having the same activity simultaneously; and

(iv) examining productivity by a fermentation method of the compound selected in (i) of the coryneform
bacterium which has been transformed with the gene obtained in (iii).

(59) A method for breeding a coryneform bacterium using the nucleotide sequence information represented by
SEQ ID NOS:2 to 3431, comprising the following:

(i) arranging function information of an open reading frame (ORF) represented by SEQ ID NOS:2 to 3431;
(ii) allowing the arranged ORF to correspond to an enzyme on a known biosynthesis or signal transmission
pathway;

(iii) explicating an unknown biosynthesis pathway or signal transmission pathway of a coryneform bacterium
in combination with information relating to a known biosynthesis pathway or signal transmission pathway of a
coryneform bacterium;

(iv) comparing the pathway explicated in (iii) with a biosynthesis pathway of a target useful product; and

(v) transgenetically varying a coryneform bacterium based on the nucleotide sequence information to either
strengthen a pathway which is judged to be important in the biosynthesis of the target useful product in (iv) or
weaken a pathway which is judged not to be important in the biosynthesis of the target useful product in (iv).

(60) A coryneform bacterium, bred by the method of any one of (52) to (59). 

(61) The coryneform bacterium according to (60), which is a microorganism belonging to the genus Corynebacterium,
the genus Brevibacterium, or the genus Microbacterium.

(62) The coryneform bacterium according to (61), wherein the microorganism belonging to the genus Corynebacterium
is selected from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium,
Corynebacterium melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(63) A method for producing at least one compound selected from an amino acid, a nucleic acid, a vitamin, a
saccharide, an organic acid and an analogue thereof, comprising:

culturing a coryneform bacterium of any one of (60) to (62) in a medium to produce and accumulate at least
one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and
analogues thereof; and

recovering the compound from the culture. 

(64) The method according to (63), wherein the compound is L-lysine. 

(65) A method for identifying a protein relating to useful mutation based on proteome analysis, comprising the 
following: 

(i) preparing

a protein derived from a bacterium of a production strain of a coryneform bacterium which has been sub- 
jected to mutation breeding by a fermentation process so as to produce at least one compound selected 
from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, and 
a protein derived from a bacterium of a parent strain of the production strain; 

(ii) separating the proteins prepared in (i) by two dimensional electrophoresis; 

(iii) detecting the separated proteins, and comparing an expression amount of the protein derived from the 
production strain with that derived from the parent strain; 

(iv) treating the protein showing different expression amounts as a result of the comparison with a peptidase 
to extract peptide fragments; 

(v) analyzing amino acid sequences of the peptide fragments obtained in (iv); and 

(vi) comparing the amino acid sequences obtained in (v) with the amino acid sequences represented by SEQ
ID NOS:3502 to 7001 to identify the protein having the amino acid sequences.

As used herein, the term "proteome", which is a word coined by combining "protein" with "genome", refers to
a method for examining a gene at the polypeptide level.

(66) The method according to (65), wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

(67) The method according to (66), wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

(68) A biologically pure culture of Corynebacterium glutamicum AHP-3 (FERM BP-7382).

[0018] The present invention will be described below in more detail, based on the determination of the full nucleotide
sequence of coryneform bacteria. 

1. Determination of full nucleotide sequence of coryneform bacteria

[0019] The term "coryneform bacteria" as used herein means a microorganism belonging to the genus Corynebacterium,
the genus Brevibacterium or the genus Microbacterium as defined in Bergey's Manual of Determinative Bacteriology,
8: 599 (1974).

[0020] Examples include Corynebacterium acetoacidophilum, Corynebacterium acetoglutamicum, Corynebacterium
callunae, Corynebacterium glutamicum, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, Brevibacterium saccharolyticum, Brevibacterium immariophilum,
Brevibacterium roseum, Brevibacterium thiogenitalis, Microbacterium ammoniaphilum, and the like.
[0021] Specific examples include Corynebacterium acetoacidophilum ATCC 13870, Corynebacterium acetoglutamicum
ATCC 15806, Corynebacterium callunae ATCC 15991, Corynebacterium glutamicum ATCC 13032, Corynebacterium
glutamicum ATCC 13060, Corynebacterium glutamicum ATCC 13826 (prior genus and species: Brevibacterium
flavum, or Corynebacterium lactofermentum), Corynebacterium glutamicum ATCC 14020 (prior genus and species:
Brevibacterium divaricatum), Corynebacterium glutamicum ATCC 13869 (prior genus and species: Brevibacterium
lactofermentum), Corynebacterium herculis ATCC 13868, Corynebacterium lilium ATCC 15990, Corynebacterium
melassecola ATCC 17965, Corynebacterium thermoaminogenes FERM 9244, Brevibacterium saccharolyticum ATCC
14066, Brevibacterium immariophilum ATCC 14068, Brevibacterium roseum ATCC 13825, Brevibacterium thiogenitalis
ATCC 19240, Microbacterium ammoniaphilum ATCC 15354, and the like.

(1) Preparation of genome DNA of coryneform bacteria

[0022] Coryneform bacteria can be cultured by a conventional method. 

[0023] Any of a natural medium and a synthetic medium can be used, so long as it is a medium suitable for efficient
culturing of the microorganism, and it contains a carbon source, a nitrogen source, an inorganic salt, and the like which
can be assimilated by the microorganism.

[0024] In Corynebacterium glutamicum, for example, a BY medium (7 g/l meat extract, 10 g/l peptone, 3 g/l sodium
chloride, 5 g/l yeast extract, pH 7.2) containing 1% of glycine and the like can be used. The culturing is carried out at
25 to 35°C overnight.

[0025] After the completion of the culture, the cells are recovered from the culture by centrifugation. The resulting
cells are washed with a washing solution.

[0026] Examples of the washing solution include STE buffer (10.3% sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l
ethylenediaminetetraacetic acid (hereinafter referred to as "EDTA"), pH 8.0), and the like.

[0027] Genome DNA can be obtained from the washed cells according to a conventional method for obtaining genome
DNA, namely, lysing the cell wall of the cells using a lysozyme and a surfactant (SDS, etc.), eliminating proteins
and the like using a phenol solution and a phenol/chloroform solution, and then precipitating the genome DNA with
ethanol or the like. Specifically, the following method can be illustrated.

[0028] The washed cells are suspended in a washing solution containing 5 to 20 mg/l lysozyme. After shaking, 5 to
20% SDS is added to lyse the cells. Usually, shaking is gently performed at 25 to 40°C for 30 minutes to 2 hours. After
shaking, the suspension is maintained at 60 to 70°C for 5 to 15 minutes for the lysis.

[0029] After the lysis, the suspension is cooled to ordinary temperature, and 5 to 20 ml of Tris-neutralized phenol is 
added thereto, followed by gently shaking at room temperature for 15 to 45 minutes. 

[0030] After shaking, centrifugation (15,000 x g, 20 minutes, 20°C) is carried out to fractionate the aqueous layer. 
[0031] After performing extraction with phenol/chloroform and extraction with chloroform (twice) in the same manner,
3 mol/l sodium acetate solution (pH 5.2) and isopropanol are added to the aqueous layer at 1/10 volume and 2
volumes of the aqueous layer, respectively, followed by gently stirring to precipitate the genome DNA.
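The volume ratios stated above can be illustrated with a minimal sketch. The function name and the dictionary keys are hypothetical and are introduced only for illustration; the ratios (1/10 volume of sodium acetate solution, 2 volumes of isopropanol) are those given in the text.

```python
def precipitation_volumes(aqueous_ml: float) -> dict:
    """Volumes (ml) to add to an aqueous layer for isopropanol precipitation.

    Assumes the ratios stated in the text: 1/10 volume of 3 mol/l sodium
    acetate (pH 5.2) and 2 volumes of isopropanol, both relative to the
    aqueous layer. Names are illustrative only.
    """
    if aqueous_ml <= 0:
        raise ValueError("aqueous layer volume must be positive")
    return {
        "sodium_acetate_ml": aqueous_ml / 10,
        "isopropanol_ml": aqueous_ml * 2,
    }


if __name__ == "__main__":
    # For a 5 ml aqueous layer
    print(precipitation_volumes(5.0))
```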
[0032] The genome DNA is dissolved again in a buffer containing 0.01 to 0.04 mg/ml RNase. As an example of the
buffer, TE buffer (10 mmol/l Tris hydrochloride, 1 mmol/l EDTA, pH 8.0) can be used. After dissolving, the resultant
solution is maintained at 25 to 40°C for 20 to 50 minutes and then extracted successively with phenol, phenol/chloroform
and chloroform as in the above case.

[0033] After the extraction, isopropanol precipitation is carried out and the resulting DNA precipitate is washed with 
70% ethanol, followed by air drying, and then dissolved in TE buffer to obtain a genome DNA solution. 

(2) Production of shotgun library 

[0034] Methods for producing a genome DNA library using the genome DNA of the coryneform bacteria prepared in
the above (1) include the method described in Molecular Cloning, A Laboratory Manual, Second Edition (1989) (hereinafter
referred to as "Molecular Cloning, 2nd ed."). In particular, the following method can be exemplified to prepare a genome
DNA library appropriately usable in determining the full nucleotide sequence by the shotgun method.
[0035] To 0.01 mg of the genome DNA of the coryneform bacteria prepared in the above (1), a buffer, such as TE
buffer or the like, is added to give a total volume of 0.4 ml. Then, the genome DNA is digested into fragments of 1 to
10 kb with a sonicator (Yamato Powersonic Model 50). The treatment with the sonicator is performed at an output of
20 continuously for 5 seconds.

[0036] The resulting genome DNA fragments are blunt-ended using DNA blunting kit (manufactured by Takara Shuzo)
or the like.

[0037] The blunt-ended genome fragments are fractionated by agarose gel or polyacrylamide gel electrophoresis 
and genome fragments of 1 to 2 kb are cut out from the gel. 

[0038] To the gel, 0.2 to 0.5 ml of a buffer for eluting DNA, such as MG elution buffer (0.5 mol/l ammonium acetate,
10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) or the like, is added, followed by shaking at 25 to 40°C
overnight to elute DNA.

[0039] The resulting DNA eluate is treated with phenol/chloroform and then precipitated with ethanol to obtain a 
genome library insert. 

[0040] This insert is ligated into a suitable vector, such as pUC18 SmaI/SAP (manufactured by Amersham Pharmacia
Biotech) or the like, using T4 ligase (manufactured by Takara Shuzo) or the like. The ligation can be carried out by
allowing a mixture to stand at 10 to 20°C for 20 to 50 hours.

[0041] The resulting ligation product is precipitated with ethanol and dissolved in 5 to 20 µl of TE buffer.

[0042] Escherichia coli is transformed in accordance with a conventional method using 0.5 to 2 µl of the ligation
solution. Examples of the transformation method include the electroporation method using ELECTRO MAX DH10B
(manufactured by Life Technologies) for Escherichia coli. The electroporation method can be carried out under the
conditions as described in the manufacturer's instructions.

[0043] The transformed Escherichia coli is spread on a suitable selection medium containing agar, for example, LB
plate medium containing 10 to 100 mg/l ampicillin (LB medium (10 g/l bactotrypton, 5 g/l yeast extract, 10 g/l sodium
chloride, pH 7.0) containing 1.6% of agar) when pUC18 is used as the cloning vector, and cultured therein.

[0044] The transformant can be obtained as colonies formed on the plate medium. In this step, it is possible to select 
the transformant having the recombinant DNA containing the genome DNA as white colonies by adding X-gal and 
IPTG (isopropyl-β-thiogalactopyranoside) to the plate medium.

[0045] The transformant is allowed to stand for culturing in a 96-well titer plate to which 0.05 ml of the LB medium
containing 0.1 mg/ml of ampicillin has been added in each well. The resulting culture can be used in an experiment of
(4) described below. Also, the culture solution can be stored at -80°C by adding 0.05 ml per well of the LB medium
containing 20% glycerol to the culture solution, followed by mixing, and the stored culture solution can be used at any
time.

(3) Production of cosmid library

[0046] The genome DNA (0.1 mg) of the coryneform bacteria prepared in the above (1) is partially digested with a
restriction enzyme, such as Sau3AI or the like, and then ultracentrifuged (26,000 rpm, 18 hours, 20°C) under a 10 to
40% sucrose density gradient using a 10% sucrose buffer (1 mol/l NaCl, 20 mmol/l Tris hydrochloride, 5 mmol/l EDTA,
10% sucrose, pH 8.0) and a 40% sucrose buffer (elevating the concentration of the 10% sucrose buffer to 40%).

[0047] After the centrifugation, the thus separated solution is fractionated into tubes in 1 ml per each tube. After
confirming the DNA fragment size of each fraction by agarose gel electrophoresis, a fraction rich in DNA fragments of
about 40 kb is precipitated with ethanol.

[0048] The resulting DNA fragment is ligated to a cosmid vector having a cohesive end which can be ligated to the
fragment. When the genome DNA is partially digested with Sau3AI, the partially digested product can be ligated to,
for example, the BamHI site of SuperCos1 (manufactured by Stratagene) in accordance with the manufacturer's
instructions.

[0049] The resulting ligation product is packaged using a packaging extract which can be prepared by a method
described in Molecular Cloning, 2nd ed. and then used in transforming Escherichia coli. More specifically, the ligation
product is packaged using, for example, a commercially available packaging extract, Gigapack III Gold Packaging
Extract (manufactured by Stratagene) in accordance with the manufacturer's instructions and then introduced into
Escherichia coli XL-1-BlueMR (manufactured by Stratagene) or the like.

[0050] The thus transformed Escherichia coli is spread on an LB plate medium containing ampicillin, and cultured
therein. 

[0051] The transformant can be obtained as colonies formed on the plate medium.

[0052] The transformant is subjected to standing culture in a 96-well titer plate to which 0.05 ml of the LB medium
containing 0.1 mg/ml ampicillin has been added. 

[0053] The resulting culture can be employed in an experiment of (4) described below. Also, the culture solution can
be stored at -80°C by adding 0.05 ml per well of the LB medium containing 20% glycerol to the culture solution, followed
by mixing, and the stored culture solution can be used at any time.

(4) Determination of nucleotide sequence 

(4-1) Preparation of template 

[0054] The full nucleotide sequence of genome DNA of coryneform bacteria can be determined basically according 
to the whole genome shotgun method (Science, 269: 496-512 (1995)). 

[0055] The template used in the whole genome shotgun method can be prepared by PCR using the library prepared
in the above (2) (DNA Research, 5: 1-9 (1998)).
[0056] Specifically, the template can be prepared as follows.

[0057] The clone derived from the whole genome shotgun library is inoculated by using a replicator (manufactured
by GENETIX) into each well of a 96-well plate to which 0.08 ml per well of the LB medium containing 0.1 mg/ml ampicillin
has been added, followed by stationarily culturing at 37°C overnight.

[0058] Next, the culture solution is transferred, using a copy plate (manufactured by Tokken), into each well of a
96-well reaction plate (manufactured by PE Biosystems) to which 0.025 ml per well of a PCR reaction solution has
been added using TaKaRa Ex Taq (manufactured by Takara Shuzo). Then, PCR is carried out in accordance with the
protocol by Makino et al. (DNA Research, 5: 1-9 (1998)) using GeneAmp PCR System 9700 (manufactured by PE
Biosystems) to amplify the inserted fragments.

[0059] The excessive primers and nucleotides are eliminated using a kit for purifying a PCR product, and the product
is used as the template in the sequencing reaction.

[0060] It is also possible to determine the nucleotide sequence using a double-stranded DNA plasmid as a template. 

[0061] The double-stranded DNA plasmid used as the template can be obtained by the following method. 
[0062] The clone derived from the whole genome shotgun library is inoculated into each well of a 24- or 96-well plate
to which 1.5 ml per well of a 2 x YT medium (16 g/l bactotrypton, 10 g/l yeast extract, 5 g/l sodium chloride, pH 7.0)
containing 0.05 mg/ml ampicillin has been added, followed by culturing under shaking at 37°C overnight.

[0063] The double-stranded DNA plasmid can be prepared from the culture solution using an automatic plasmid
preparing machine KURABO PI-50 (manufactured by Kurabo Industries), a multiscreen (manufactured by Millipore)
or the like, according to each protocol.

[0064] To purify the plasmid, Biomek 2000 manufactured by Beckman Coulter and the like can be used. 

[0065] The resulting purified double-stranded DNA plasmid is dissolved in water to give a concentration of about 0.1 

mg/ml. Then, it can be used as the template in sequencing. 

(4-2) Sequencing reaction

[0066] The sequencing reaction can be carried out according to a commercially available sequence kit or the like. A 
specific method is exemplified below. 

[0067] To 6 µl of a solution of ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (manufactured
by PE Biosystems), 1 to 2 pmol of an M13 regular direction primer (M13-21) or an M13 reverse direction primer
(M13REV) (DNA Research, 5: 1-9 (1998)) and 50 to 200 ng of the template prepared in the above (4-1) (the PCR
product or plasmid) are added to give 10 µl of a sequencing reaction solution.

[0068] A dye terminator sequencing reaction (35 to 55 cycles) is carried out using this reaction solution and GeneAmp
PCR System 9700 (manufactured by PE Biosystems) or the like. The cycle parameter can be determined in accordance
with a commercially available kit, for example, the manufacturer's instructions attached to ABI PRISM BigDye
Terminator Cycle Sequencing Ready Reaction Kit.

[0069] The sample can be purified using a commercially available product, such as Multi Screen HV plate
(manufactured by Millipore) or the like, according to the manufacturer's instructions.

[0070] The thus purified reaction product is precipitated with ethanol, dried and then used for the analysis. The dried
reaction product can be stored in the dark at -30°C and the stored reaction product can be used at any time.

[0071] The dried reaction product can be analyzed using a commercially available sequencer and an analyzer
according to the manufacturer's instructions.

[0072] Examples of the commercially available sequencer include ABI PRISM 377 DNA Sequencer (manufactured
by PE Biosystems). Examples of the analyzer include ABI PRISM 3700 DNA Analyzer (manufactured by PE Biosystems).

(5) Assembly 

[0073] Software such as phred (The University of Washington) or the like can be used for base calling in
analyzing the sequence information obtained in the above (4). Software such as Cross_Match (The University of
Washington), SPS Cross_Match (manufactured by Southwest Parallel Software) or the like can be used to mask
the vector sequence information.
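The idea of masking vector sequence in a read can be illustrated by the following minimal sketch. It is not the algorithm used by Cross_Match; it assumes exact matching of short windows (a hypothetical `min_len` of 12 bases) against the vector sequence, and the function name is introduced only for illustration.

```python
def mask_vector(read: str, vector: str, min_len: int = 12) -> str:
    """Replace bases of `read` that exactly match the vector with 'X'.

    Simplified illustration: every window of `min_len` bases in the read
    that also occurs in the vector is masked. Real tools use alignment
    with mismatches and gaps; this sketch uses exact matching only.
    """
    read_u, vector_u = read.upper(), vector.upper()
    # All windows of length min_len present in the vector
    kmers = {vector_u[i:i + min_len] for i in range(len(vector_u) - min_len + 1)}
    masked = [False] * len(read_u)
    for i in range(len(read_u) - min_len + 1):
        if read_u[i:i + min_len] in kmers:
            for j in range(i, i + min_len):
                masked[j] = True
    return "".join("X" if m else base for base, m in zip(read_u, masked))


if __name__ == "__main__":
    vector = "GAATTCGAGCTCGGTACC"          # hypothetical vector fragment
    insert = "ATGCATGCAAAT"                # hypothetical genomic insert
    print(mask_vector(vector[:14] + insert, vector))
```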

[0074] For the assembly, a software, such as phrap (The University of Washington), SPS phrap (manufactured by 
Southwest Parallel Software) or the like, can be used. 
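The principle of assembling overlapping reads into a contig can be sketched as follows. This is a toy greedy merge based on exact suffix/prefix overlaps; it is not the phrap algorithm, which additionally uses base quality values and tolerates sequencing errors. The function names and the `min_len` parameter are assumptions made for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b` (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the longest exact overlap."""
    contigs = list(dict.fromkeys(reads))  # remove duplicate reads, keep order
    while True:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            return contigs  # no remaining overlaps
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


if __name__ == "__main__":
    print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))
```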

[0075] For the above analysis and the output of the results thereof, a computer such as UNIX, PC, Macintosh, and the
like can be used.

[0076] Contig obtained by the assembly can be analyzed using a graphical editor such as consed (The University 
of Washington) or the like. 

[0077] It is also possible to perform the series of operations from base calling to assembly in a single run using the
script phredPhrap attached to consed.
[0078] As used herein, software will be understood to also be referred to as a comparator.

(6) Determination of nucleotide sequence in gap part 

[0079] Each of the cosmids in the cosmid library constructed in the above (3) is prepared in the same manner as in
the preparation of the double-stranded DNA plasmid described in the above (4-1). The nucleotide sequence at the end
of the insert fragment of the cosmid is determined using a commercially available kit, such as ABI PRISM BigDye
Terminator Cycle Sequencing Ready Reaction Kit (manufactured by PE Biosystems) according to the manufacturer's
instructions.

[0080] About 800 cosmid clones are sequenced at both ends of the inserted fragment to detect a nucleotide sequence
in the contig derived from the shotgun sequencing obtained in (5) which is coincident with the sequence. Thus, the
chain linkage between respective cosmid clones and respective contigs is clarified, and mutual alignment is carried
out. Furthermore, the results are compared with known physical maps to map the cosmids and the contigs. In case of
Corynebacterium glutamicum ATCC 13032, a physical map of Mol. Gen. Genet., 252: 255-265 (1996) can be used.
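How cosmid end sequences can reveal the linkage between contigs may be sketched as below. This is a simplified illustration that assumes exact substring matches on one strand only; the function name, the dictionary layout, and the example sequences are hypothetical.

```python
def link_contigs(contigs: dict, cosmid_ends: dict) -> list:
    """Return (cosmid, contig_a, contig_b) for cosmids whose two ends fall in different contigs.

    contigs:      {contig name: contig sequence}
    cosmid_ends:  {cosmid name: (left end sequence, right end sequence)}
    Simplification: exact matching, forward strand only.
    """
    def locate(end_seq):
        for name, sequence in contigs.items():
            if end_seq in sequence:
                return name
        return None

    links = []
    for cosmid, (left, right) in cosmid_ends.items():
        a, b = locate(left), locate(right)
        if a is not None and b is not None and a != b:
            links.append((cosmid, a, b))  # this cosmid bridges two contigs
    return links


if __name__ == "__main__":
    contigs = {"c1": "AAACCCGGGTTT", "c2": "TTGACGTCAGGA"}
    ends = {"cos1": ("CCCGGG", "ACGTCA"), "cos2": ("AAACCC", "GGGTTT")}
    print(link_contigs(contigs, ends))
```

In this toy input, cos1 has one end in each contig and therefore links them, while both ends of cos2 lie within c1 and give no linkage information.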
[0081] The sequence in the region which cannot be covered with the contigs (gap part) can be determined by the
following method.

[0082] Clones containing sequences positioned at the ends of the contigs are selected. Among these, a clone wherein
only one end of the inserted fragment has been determined is selected and the sequence at the opposite end of the
inserted fragment is determined.

[0083] A shotgun library clone or a cosmid clone derived therefrom containing the sequences at the respective ends 
of the inserted fragments in the two contigs is identified and the full nucleotide sequence of the inserted fragment of 
the clone is determined. 

[0084] According to this method, the nucleotide sequence of the gap part can be determined. 
[0085] When no shotgun library clone or cosmid clone covering the gap part is available, primers complementary to
the end sequences of the two different contigs are prepared and the DNA fragment in the gap part is amplified. Then,
sequencing is performed by the primer walking method using the amplified DNA fragment as a template or by the 
shotgun method in which the sequence of a shotgun clone prepared from the amplified DNA fragment is determined. 
Thus, the nucleotide sequence of the above-described region can be determined. 

[0086] In a region showing a low sequence accuracy, primers are synthesized using the AUTOFINISH function and
NAVIGATING function of consed (The University of Washington), and the sequence is determined by the primer walking
method to improve the sequence accuracy.

[0087] Examples of the thus determined nucleotide sequence of the full genome include the full nucleotide sequence
of genome of Corynebacterium glutamicum ATCC 13032 represented by SEQ ID NO:1.

(7) Determination of nucleotide sequence of microorganism genome DNA using the nucleotide sequence represented 
by SEQ ID NO:1 

[0088] A nucleotide sequence of a polynucleotide having a homology of 80% or more with the full nucleotide sequence 
of Corynebacterium glutamicum ATCC 13032 represented by SEQ ID NO:1 as determined above can also be deter- 
mined using the nucleotide sequence represented by SEQ ID NO:1, and the polynucleotide having a nucleotide se- 
quence having a homology of 80% or more with the nucleotide sequence represented by SEQ ID NO:1 of the present 
invention is within the scope of the present invention. The term "polynucleotide having a nucleotide sequence having
a homology of 80% or more with the nucleotide sequence represented by SEQ ID NO:1 of the present invention" is a
polynucleotide in which a full nucleotide sequence of the chromosome DNA can be determined using as a primer an
oligonucleotide composed of continuous 5 to 50 nucleotides in the nucleotide sequence represented by SEQ ID NO:1,
for example, according to PCR using the chromosome DNA as a template. A particularly preferred primer in determination
of the full nucleotide sequence is an oligonucleotide having nucleotide sequences which are positioned at
the interval of about 300 to 500 bp, and among such oligonucleotides, an oligonucleotide having a nucleotide sequence
selected from DNAs encoding a protein relating to a main metabolic pathway is particularly preferred. The polynucleotide
in which the full nucleotide sequence of the chromosome DNA can be determined using the oligonucleotide
includes polynucleotides constituting a chromosome DNA derived from a microorganism belonging to coryneform bacteria.
Such a polynucleotide is preferably a polynucleotide constituting chromosome DNA derived from a microorganism
belonging to the genus Corynebacterium, more preferably a polynucleotide constituting a chromosome DNA of
Corynebacterium glutamicum.
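The selection of primers positioned at regular intervals along a reference sequence can be sketched as follows. This is an illustration only: the function name, the default primer length of 20 nucleotides, and the default interval of 400 bp are assumptions chosen within the ranges stated above (5 to 50 nucleotides; about 300 to 500 bp), and no check of melting temperature or specificity is made.

```python
def tiled_primers(reference: str, primer_len: int = 20, interval: int = 400):
    """Return (start position, primer sequence) pairs taken every `interval` bases.

    Primers are continuous stretches of the reference, 5 to 50 nucleotides
    long, as described in the text. Positions are 0-based.
    """
    if not 5 <= primer_len <= 50:
        raise ValueError("primer length must be 5 to 50 nucleotides")
    if interval <= 0:
        raise ValueError("interval must be positive")
    last_start = len(reference) - primer_len
    return [
        (start, reference[start:start + primer_len])
        for start in range(0, last_start + 1, interval)
    ]


if __name__ == "__main__":
    reference = "ACGT" * 300  # hypothetical 1,200 bp reference
    for position, primer in tiled_primers(reference):
        print(position, primer)
```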

2. Identification of ORF (open reading frame) and expression regulatory fragment and determination of the function of 
ORF 

[0089] Based on the full nucleotide sequence data of the genome derived from coryneform bacteria determined in
the above item 1, an ORF and an expression modulating fragment can be identified. Furthermore, the function of the
thus determined ORF can be determined.

[0090] The ORF means a continuous region in the nucleotide sequence of mRNA which can be translated as an
amino acid sequence to mature to a protein. A region of the DNA coding for the ORF of mRNA is also called ORF.
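The notion of an ORF as a continuous translatable region can be illustrated by a naive scan for a start codon followed in frame by a stop codon. This sketch is an assumption-laden simplification: it examines only the forward strand, treats only ATG as a start codon, and is not the model-based identification performed by the gene-finding software mentioned elsewhere in this description.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 3):
    """Return (start, end) pairs (0-based, end exclusive) of simple ORFs on the forward strand.

    An ORF here runs from the first ATG in a frame to the next in-frame stop
    codon, stop codon included, and must contain at least `min_codons` codons.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    sequence = "CCATGAAATTTTAGGG"  # hypothetical sequence
    for start, end in find_orfs(sequence):
        print(start, end, sequence[start:end])
```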
[0091] The expression modulating fragment (hereinafter referred to as "EMF") is used herein to define a series of
polynucleotide fragments which modulate the expression of the ORF or another sequence ligated operatably thereto.
The expression "modulate the expression of a sequence ligated operatably" is used herein to refer to changes in the
expression of a sequence due to the presence of the EMF. Examples of the EMF include a promoter, an operator, an
enhancer, a silencer, a ribosome-binding sequence, a transcriptional termination sequence, and the like. In coryneform
bacteria, an EMF is usually present in an intergenic segment (a fragment positioned between two genes; about 10 to
200 nucleotides in length). Accordingly, an EMF is frequently present in an intergenic segment of 10 nucleotides or
longer. It is also possible to determine or discover the presence of an EMF by using known EMF sequences as a target
sequence or a target structural motif (or a target motif) using an appropriate software or comparator, such as FASTA
(Proc. Natl. Acad. Sci. USA, 85: 2444-48 (1988)), BLAST (J. Mol. Biol., 215: 403-410 (1990)) or the like. Also, it can
be identified and evaluated using a known EMF-capturing vector (for example, pKK232-8; manufactured by Amersham
Pharmacia Biotech).
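Candidate intergenic segments of 10 nucleotides or longer can be listed from gene coordinates as in the following minimal sketch. The coordinate convention (0-based, end exclusive), the function name, and the example coordinates are assumptions made for illustration; strand and circularity of the genome are ignored.

```python
def intergenic_segments(genes, min_len: int = 10):
    """Return (start, end) gaps of at least `min_len` nucleotides between neighbouring genes.

    genes: iterable of (start, end) coordinates, 0-based, end exclusive, in any order.
    Overlapping genes are handled by tracking the furthest gene end seen so far.
    """
    segments = []
    furthest_end = None
    for start, end in sorted(genes):
        if furthest_end is not None and start - furthest_end >= min_len:
            segments.append((furthest_end, start))
        furthest_end = end if furthest_end is None else max(furthest_end, end)
    return segments


if __name__ == "__main__":
    # Hypothetical gene coordinates
    print(intergenic_segments([(0, 100), (105, 300), (450, 900)]))
```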

[0092] The term "target sequence" is used herein to refer to a nucleotide sequence composed of 6 or more nucleotides,
an amino acid sequence composed of 2 or more amino acids, or a nucleotide sequence encoding this amino
acid sequence composed of 2 or more amino acids. The longer a target sequence is, the lower the possibility that it
appears at random in a data base. The target sequence is preferably about 10 to 100 amino acid residues or about
30 to 300 nucleotide residues.
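Why a longer target sequence is less likely to appear at random can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, a data base of independent and equally frequent residues (4 nucleotides or 20 amino acids); real sequences deviate from this assumption, and the function name is hypothetical.

```python
def expected_random_hits(database_len: int, target_len: int, alphabet_size: int = 4) -> float:
    """Expected number of chance occurrences of one specific target in a random data base.

    Assumes independent, uniformly distributed residues:
    (number of start positions) x (1 / alphabet_size) ** target_len.
    """
    positions = max(0, database_len - target_len + 1)
    return positions * float(alphabet_size) ** (-target_len)


if __name__ == "__main__":
    for length in (6, 12, 30):
        print(length, expected_random_hits(1_000_000, length))
```

Under these assumptions, a 6-nucleotide target is expected to occur many times by chance in a data base of one million nucleotides, whereas a 30-nucleotide target is expected essentially never.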

[0093] The term "target structural motif" or "target motif" is used herein to refer to a sequence or a combination of
sequences selected optionally and reasonably. Such a motif is selected on the basis of the three-dimensional structure
formed by the folding of a polypeptide by means known to one of ordinary skill in the art. Various motifs are known.
[0094] Examples of the target motif of a polypeptide include, but are not limited to, an enzyme activity site, a protein- 
protein interaction site, a signal sequence, and the like. Examples of the target motif of a nucleic acid include a promoter 
sequence, a transcriptional regulatory factor binding sequence, a hair pin structure, and the like. 
[0095] Examples of highly useful EMF include a high-expression promoter, an inducible-expression promoter, and
the like. Such an EMF can be obtained by positionally determining the nucleotide sequence of a gene which is known
or expected as achieving high expression (for example, ribosomal RNA gene: GenBank Accession No. M16175 or
Z46753) or a gene showing a desired induction pattern (for example, isocitrate lyase gene induced by acetic acid:
Japanese Published Unexamined Patent Application No. 56782/93) via the alignment with the full genome nucleotide
sequence determined in the above item 1, and isolating the genome fragment in the upstream part (usually 200 to 500
nucleotides from the translation initiation site). It is also possible to obtain a highly useful EMF by selecting an EMF
showing a high expression efficiency or a desired induction pattern from among promoters captured by the
EMF-capturing vector as described above.
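Extraction of the upstream part relative to a translation initiation site can be sketched as follows. The function names, the coordinate convention, and the default length of 300 nucleotides (within the 200 to 500 range stated above) are assumptions; the genome is treated as linear, so regions are simply clipped at the sequence ends.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def upstream_region(genome: str, start_pos: int, strand: str = "+", length: int = 300) -> str:
    """Return up to `length` nucleotides upstream of a translation initiation site.

    start_pos: 0-based forward-strand index of the first base of the start codon
               (for a '-' strand gene, the index of the base pairing with it).
    The result is given 5' to 3' on the strand of the gene.
    """
    if strand == "+":
        return genome[max(0, start_pos - length):start_pos].upper()
    if strand == "-":
        return reverse_complement(genome[start_pos + 1:start_pos + 1 + length])
    raise ValueError("strand must be '+' or '-'")


if __name__ == "__main__":
    print(upstream_region("AAAAACCCCCATGTTT", 10, "+", 5))
    print(upstream_region("AAACATGGGGGTTTTT", 5, "-", 5))
```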

[0096] The ORF can be identified by extracting characteristics common to individual ORFs, constructing a general
model based on these characteristics, and measuring the conformity of the subject sequence with the model. In the
identification, a software, such as GeneMark (Nuc. Acids. Res., 22: 4766-67 (1994): manufactured by GenePro),
GeneMark.hmm (manufactured by GenePro), GeneHacker (Protein, Nucleic Acid and Enzyme, 42: 3001-07 (1997)),
Glimmer (Nuc. Acids. Res., 26: 544-548 (1998): manufactured by The Institute for Genomic Research), or the like, can
be used. In using the software, the default (initial setting) parameters are usually used, though the parameters can be
optionally changed.

[0097] In the above-described comparisons, a computer, such as UNIX, PC, Macintosh, or the like, can be used. 
[0098] Examples of the ORF determined by the method of the present invention include ORFs having the nucleotide
sequences represented by SEQ ID NOS:2 to 3501 present in the genome of Corynebacterium glutamicum as represented
by SEQ ID NO:1. In these ORFs, polypeptides having the amino acid sequences represented by SEQ ID NOS:3502
to 7001 are encoded.

[0099] The function of an ORF can be determined by comparing the identified amino acid sequence of the ORF with
known homologous sequences using a homology searching software or comparator, such as BLAST, FASTA, Smith &
Waterman (Meth. Enzym., 164: 765 (1988)) or the like on an amino acid data base, such as Swiss-Prot, PIR, GenBank-nr-aa,
GenPept constituted by protein-encoding domains derived from GenBank data base, OWL or the like.

[0100] Furthermore, by the homology searching, the identity and similarity with the amino acid sequences of known
proteins can also be analyzed.

[0101] With respect to the term "identity" used herein, where two polypeptides each having 10 amino acids
differ at 3 amino acid positions, these polypeptides have an identity of 70% with each other. Where
one of the 3 differing amino acids is an analogue (for example, leucine and isoleucine), these polypeptides have a similarity
of 80%.
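
The arithmetic of this definition can be sketched as follows. The analogue groups are an assumption chosen for illustration (homology searching software typically relies on a substitution matrix instead), and the two peptides in the usage note are invented.

```python
# Illustrative groups of amino acids treated as analogues of one another.
# This grouping is an assumption for the sketch, not a definition from the text.
ANALOGUE_GROUPS = [set("LIVM"), set("DE"), set("KR"), set("ST"), set("FYW")]


def identity_and_similarity(a, b):
    """Return (identity %, similarity %) for two pre-aligned, equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    identical = sum(x == y for x, y in zip(a, b))
    analogous = sum(
        x != y and any(x in group and y in group for group in ANALOGUE_GROUPS)
        for x, y in zip(a, b)
    )
    return 100.0 * identical / len(a), 100.0 * (identical + analogous) / len(a)
```

For the invented 10-residue peptides `MKTAYLAKQR` and `MKTAYIAGQP`, which differ at 3 positions of which one is a leucine/isoleucine pair, the function returns an identity of 70% and a similarity of 80%.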

[0102] As a specific example, Table 1 shows the registration numbers in known data bases of sequences which are
judged as having the highest similarity with the nucleotide sequence of the ORF derived from Corynebacterium glutami- 
cum ATCC 13032, genes of these sequences, functions of these genes, and identities thereof compared with known 
amino acid translation sequences. 
[0103] Thus, a great number of novel genes derived from coryneform bacteria can be identified by determining the
full nucleotide sequence of the genome derived from coryneform bacterium by the means of the present invention.
Moreover, the function of the proteins encoded by these genes can be determined. Since coryneform bacteria are
industrially highly useful microorganisms, many of the identified genes are industrially useful.
[0104] Moreover, the characteristics of respective microorganisms can be clarified by classifying the functions thus
determined. As a result, valuable information in breeding is obtained. 

[0105] Furthermore, from the ORF information derived from coryneform bacteria, the ORF corresponding to the
microorganism is prepared and obtained according to the general method as disclosed in Molecular Cloning, 2nd ed.
or the like. Specifically, an oligonucleotide having a nucleotide sequence adjacent to the ORF is synthesized, and the
ORF can be isolated and obtained using the oligonucleotide as a primer and a chromosome DNA derived from
coryneform bacteria as a template according to the general PCR cloning technique. Thus obtained ORF sequences
include polynucleotides comprising the nucleotide sequence represented by any one of SEQ ID NOS:2 to 3501.
[0106] The ORF or primer can be prepared using a polypeptide synthesizer based on the above sequence information.

[0107] Examples of the polynucleotide of the present invention include a polynucleotide containing the nucleotide 
sequence of the ORF obtained in the above, and a polynucleotide which hybridizes with the polynucleotide under 
stringent conditions. 

[0108] The polynucleotide of the present invention can be a single-stranded DNA, a double-stranded DNA and a
single-stranded RNA, though it is not limited thereto.

[0109] The polynucleotide which hybridizes with the polynucleotide containing the nucleotide sequence of the ORF
obtained in the above under stringent conditions includes a degenerated mutant of the ORF. A degenerated mutant is
a polynucleotide fragment having a nucleotide sequence which differs from the sequence of the ORF of the present
invention but which encodes the same amino acid sequence owing to the degeneracy of the genetic code.
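
The notion of a degenerated mutant can be illustrated with a minimal sketch that translates two nucleotide sequences with the standard genetic code and checks that they differ as nucleotides yet encode the same peptide. The function names and the short example sequences are hypothetical and are not taken from the sequence listing.

```python
# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(seq):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop)."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def is_degenerate_variant(orf, candidate):
    """True if the sequences differ but encode the same amino acid sequence."""
    return orf.upper() != candidate.upper() and translate(orf) == translate(candidate)
```

For instance, `ATGCTGAAA` and `ATGTTAAAG` both translate to Met-Leu-Lys, so each is a degenerate variant of the other.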

[0110] Specific examples include a polynucleotide comprising the nucleotide sequence represented by any one of
SEQ ID NOS:2 to 3431, and a polynucleotide which hybridizes with the polynucleotide under stringent conditions.
[0111] A polynucleotide which hybridizes under stringent conditions is a polynucleotide obtained by colony
hybridization, plaque hybridization, Southern blot hybridization or the like using, as a probe, the polynucleotide having the
nucleotide sequence of the ORF identified in the above. Specific examples include a polynucleotide which can be
identified by carrying out hybridization at 65°C in the presence of 0.7-1.0 M NaCl using a filter on which a polynucleotide
prepared from colonies or plaques is immobilized, and then washing the filter with 0.1x to 2x SSC solution (the
composition of 1x SSC contains 150 mM sodium chloride and 15 mM sodium citrate) at 65°C.

[0112] The hybridization can be carried out in accordance with known methods described in, for example, Molecular
Cloning, 2nd ed., Current Protocols in Molecular Biology, DNA Cloning 1: Core Techniques, A Practical Approach,
Second Edition, Oxford University (1995) or the like. Specific examples of the polynucleotide which can be hybridized
include a DNA having a homology of 60% or more, preferably 80% or more, and particularly preferably 95% or more,
with the nucleotide sequence represented by any one of SEQ ID NOS:2 to 3431 when calculated using default (initial
setting) parameters of homology searching software, such as BLAST, FASTA, Smith-Waterman or the like.
[0113] Also, the polynucleotide of the present invention includes a polynucleotide encoding a polypeptide comprising
the amino acid sequence represented by any one of SEQ ID NOS:3502 to 6931 and a polynucleotide which hybridizes
with the polynucleotide under stringent conditions. 

[0114] Furthermore, the polynucleotide of the present invention includes a polynucleotide which is present in the 5' 
upstream or 3' downstream region of a polynucleotide comprising the nucleotide sequence of any one of SEQ ID NOS: 
2 to 3431 in a polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1, and has an activity 
of regulating an expression of a polypeptide encoded by the polynucleotide. Specific examples of the polynucleotide
having an activity of regulating an expression of a polypeptide encoded by the polynucleotide include a polynucleotide
encoding the above described EMF, such as a promoter, an operator, an enhancer, a silencer, a ribosome-binding 
sequence, a transcriptional termination sequence, and the like. 

[0115] The primer used for obtaining the ORF according to the above PCR cloning technique includes an
oligonucleotide comprising a sequence which is the same as a sequence of 10 to 200 continuous nucleotides in the nucleotide
sequence of the ORF and an adjacent region or an oligonucleotide comprising a sequence which is complementary
to the oligonucleotide. Specific examples include an oligonucleotide comprising a sequence which is the same as a
sequence of 10 to 200 continuous nucleotides of the nucleotide sequence represented by any one of SEQ ID NOS:1
to 3431, and an oligonucleotide comprising a sequence complementary to the oligonucleotide comprising a sequence
of at least 10 to 20 continuous nucleotides of any one of SEQ ID NOS:1 to 3431. When the primers are used as a sense
primer and an antisense primer, the above-described oligonucleotides in which melting temperature (Tm) and the
number of nucleotides are not significantly different from each other are preferred.
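
One simple way to check that a sense and an antisense primer are matched in this respect is sketched below. It estimates Tm with the Wallace rule (2°C per A or T, 4°C per G or C), which is a common rule of thumb for short oligonucleotides and is not prescribed by the present text; the tolerance values `max_tm_diff` and `max_len_diff` are arbitrary assumptions.

```python
def wallace_tm(oligo):
    """Estimate melting temperature (deg C) of a short oligonucleotide by the Wallace rule."""
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


def primers_compatible(sense, antisense, max_tm_diff=5, max_len_diff=3):
    """True if the two primers differ little in estimated Tm and in length.

    The default tolerances are illustrative assumptions, not recommended values.
    """
    tm_ok = abs(wallace_tm(sense) - wallace_tm(antisense)) <= max_tm_diff
    len_ok = abs(len(sense) - len(antisense)) <= max_len_diff
    return tm_ok and len_ok
```

More accurate estimates, such as nearest-neighbor thermodynamic models, would be preferable for longer primers.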

[0116] The oligonucleotide of the present invention includes an oligonucleotide comprising a sequence which is the
same as 10 to 200 continuous nucleotides of the nucleotide sequence represented by any one of SEQ ID NOS:1 to
3431 or an oligonucleotide comprising a sequence complementary to the oligonucleotide.

[0117] Analogues of these oligonucleotides (hereinafter also referred to as "analogous oligonucleotides") are
also provided by the present invention and are useful in the methods described herein.

[0118] Examples of the analogous oligonucleotides include analogous oligonucleotides in which a phosphodiester
bond in an oligonucleotide is converted to a phosphorothioate bond, analogous oligonucleotides in which a
phosphodiester bond in an oligonucleotide is converted to an N3'-P5' phosphoramidate bond, analogous oligonucleotides in which
ribose and a phosphodiester bond in an oligonucleotide are converted to a peptide nucleic acid bond, analogous
oligonucleotides in which uracil in an oligonucleotide is replaced with C-5 propynyluracil, analogous oligonucleotides in
which uracil in an oligonucleotide is replaced with C-5 thiazoluracil, analogous oligonucleotides in which cytosine in
an oligonucleotide is replaced with C-5 propynylcytosine, analogous oligonucleotides in which cytosine in an
oligonucleotide is replaced with phenoxazine-modified cytosine, analogous oligonucleotides in which ribose in an
oligonucleotide is replaced with 2'-O-propylribose, analogous oligonucleotides in which ribose in an oligonucleotide is replaced
with 2'-methoxyethoxyribose, and the like (Cell Engineering, 16: 1463 (1997)).

[0119] The above oligonucleotides and analogous oligonucleotides of the present invention can be used not only as
primers but also as probes for hybridization and as the antisense nucleic acids described below.

[0120] Examples of a primer for the antisense nucleic acid techniques known in the art include an oligonucleotide
which hybridizes with the oligonucleotide of the present invention under stringent conditions and has an activity regulating
expression of the polypeptide encoded by the polynucleotide, in addition to the above oligonucleotide.

3. Determination of isozymes 

[0121] Many mutants of coryneform bacteria which are useful in the production of useful substances, such as amino
acids, nucleic acids, vitamins, saccharides, organic acids, and the like, are obtained by the present invention.
[0122] However, since the gene sequence data of the microorganism has been, to date, insufficient, useful mutants
have been obtained by mutagenic techniques using a mutagen, such as nitrosoguanidine (NTG) or the like.
[0123] Although genes can be mutated randomly by the mutagenic method using the above-described mutagen, all 
genes encoding respective isozymes having similar properties relating to the metabolism of intermediates cannot be 
mutated. In the mutagenic method using a mutagen, genes are mutated randomly. Accordingly, harmful mutations 
worsening culture characteristics, such as delay in growth, accelerated foaming, and the like, might be imparted at a
great frequency, in a random manner. 

[0124] However, if gene sequence information is available, such as is provided by the present invention, it is possible
to mutate all of the genes encoding target isozymes. In this case, harmful mutations may be avoided and the target 
mutation can be incorporated. 

[0125] Namely, an accurate number and sequence information of the target isozymes in coryneform bacteria can be
obtained based on the ORF data obtained in the above item 2. By using the sequence information, all of the target
isozyme genes can be mutated into genes having the desired properties by, for example, the site-specific mutagenesis
method described in Molecular Cloning, 2nd ed. to obtain useful mutants having elevated productivity of useful sub- 
stances. 

4. Clarification or determination of biosynthesis pathway and signal transmission pathway 

[0126] Attempts have been made to elucidate biosynthesis pathways and signal transmission pathways in a number
of organisms, and many findings have been reported. However, there are many unknown aspects of coryneform bac- 
teria since a number of genes have not been identified so far. 
[0127] These unknown points can be clarified by the following method. 

[0128] The functional information of ORF derived from coryneform bacteria as identified by the method of above item
2 is arranged. The term "arranged" means that the ORF is classified based on the biosynthesis pathway of a substance
or the signal transmission pathway to which the ORF belongs using known information according to the functional
information. Next, the arranged ORF sequence information is compared with enzymes on the biosynthesis pathways
or signal transmission pathways of other known organisms. The resulting information is combined with known data on
coryneform bacteria. Thus, the biosynthesis pathways and signal transmission pathways in coryneform bacteria, which 
have been unknown so far, can be determined. 
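
The arranging and comparing steps can be pictured with a small sketch. The data layout, the function names, and the ORF identifiers and enzyme lists in the usage note are placeholders invented for illustration; they do not reproduce Table 1 or any real annotation.

```python
from collections import defaultdict


def arrange_by_pathway(annotations):
    """Group ORFs as {pathway: {enzyme: [orf_id, ...]}}.

    `annotations` is an iterable of (orf_id, pathway, enzyme) tuples.
    """
    arranged = defaultdict(lambda: defaultdict(list))
    for orf_id, pathway, enzyme in annotations:
        arranged[pathway][enzyme].append(orf_id)
    return {pathway: dict(enzymes) for pathway, enzymes in arranged.items()}


def unmatched_enzymes(arranged, reference):
    """For each reference pathway, list the enzymes for which no ORF was arranged."""
    return {
        pathway: [e for e in enzymes if e not in arranged.get(pathway, {})]
        for pathway, enzymes in reference.items()
    }
```

Given placeholder annotations for `orf_a` (aspartokinase) and `orf_b` (dihydrodipicolinate synthase) in a pathway labeled "lysine biosynthesis", and a reference list that also names aspartate-semialdehyde dehydrogenase, `unmatched_enzymes` reports that third enzyme as a step still to be located in the genome.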

[0129] Once these pathways, which have hitherto been unknown or unclear, are clarified, a useful mutant
for producing a target useful substance can be efficiently obtained.

[0130] When the thus clarified pathway is judged as important in the synthesis of a useful product, a useful mutant
can be obtained by selecting a mutant wherein this pathway has been strengthened. Also, when the thus clarified
pathway is judged as not important in the biosynthesis of the target useful product, a useful mutant can be obtained
by selecting a mutant wherein the utilization frequency of this pathway is lowered. 

5. Clarification or determination of useful mutation point 

[0131] Many useful mutants of coryneform bacteria which are suitable for the production of useful substances, such
as amino acids, nucleic acids, vitamins, saccharides, organic acids, and the like, have been obtained. However, it is
hardly known which mutation points, imparted to which genes, improve the productivity.

[0132] However, mutation points contained in production strains can be identified by comparing desired sequences
of the genome DNA of the production strains obtained from coryneform bacteria by the mutagenic technique with the
nucleotide sequences of the corresponding genome DNA and ORF derived from coryneform bacteria determined by
the methods of the above items 1 and 2 and analyzing them.
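
The comparison can be sketched as a codon-by-codon scan of a wild type ORF against the corresponding ORF of a production strain, reporting each amino acid replacement in the Val59Ala style of notation. The sketch assumes the two sequences are already aligned and of equal length (substitutions only); the function name and the toy sequences in the usage note are hypothetical.

```python
# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys", "Q": "Gln",
    "E": "Glu", "G": "Gly", "H": "His", "I": "Ile", "L": "Leu", "K": "Lys",
    "M": "Met", "F": "Phe", "P": "Pro", "S": "Ser", "T": "Thr", "W": "Trp",
    "Y": "Tyr", "V": "Val", "*": "Ter",
}


def amino_acid_changes(wild_type, mutant):
    """List amino acid replacements between two aligned coding sequences."""
    wild_type, mutant = wild_type.upper(), mutant.upper()
    if len(wild_type) != len(mutant) or len(wild_type) % 3:
        raise ValueError("expected aligned coding sequences of equal length")
    changes = []
    for pos in range(0, len(wild_type), 3):
        before = CODON_TABLE[wild_type[pos:pos + 3]]
        after = CODON_TABLE[mutant[pos:pos + 3]]
        if before != after:  # silent (synonymous) changes are skipped
            changes.append(f"{THREE_LETTER[before]}{pos // 3 + 1}{THREE_LETTER[after]}")
    return changes
```

With the toy sequences `ATGGTTCCA` and `ATGGCTCCA`, the function returns `["Val2Ala"]`; a synonymous change such as GTT to GTC is not reported.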

[0133] Moreover, effective mutation points contributing to the production can be easily specified from among these 
mutation points on the basis of known information relating to the metabolic pathways, the metabolic regulatory mech- 
anisms, the structure activity correlation of enzymes, and the like. 
[0134] When an effective mutation can hardly be specified based on known data, the mutation points thus identified
can be introduced into a wild strain of coryneform bacteria or a production strain free of the mutation. Then, it is examined 
whether or not any positive effect can be achieved on the production. 

[0135] For example, by comparing the nucleotide sequence of homoserine dehydrogenase gene hom of a lysine-
producing B-6 strain of Corynebacterium glutamicum (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)) with the
nucleotide sequence corresponding to the genome of Corynebacterium glutamicum ATCC 13032 according to the present
invention, a mutation of amino acid replacement in which valine at the 59-position is replaced with alanine (Val59Ala)
was identified. A strain obtained by introducing this mutation into the ATCC 13032 strain by the gene replacement 
method can produce lysine, which indicates that this mutation is an effective mutation contributing to the production 
of lysine. 

[0136] Similarly, by comparing the nucleotide sequence of pyruvate carboxylase gene pyc of the B-6 strain with the
nucleotide sequence corresponding to the ATCC 13032 genome, a mutation of amino acid replacement in which proline
at the 458-position was replaced with serine (Pro458Ser) was identified. A strain obtained by introducing this mutation
into a lysine-producing strain of No. 58 (FERM BP-7134) of Corynebacterium glutamicum free of this mutation shows
an improved lysine productivity in comparison with the No. 58 strain, which indicates that this mutation is an effective
mutation contributing to the production of lysine.

[0137] In addition, a mutation Ala213Thr in glucose-6-phosphate dehydrogenase was specified as an effective
mutation relating to the production of lysine by detecting glucose-6-phosphate dehydrogenase gene zwf of the B-6 strain.
[0138] Furthermore, the lysine-productivity of Corynebacterium glutamicum was improved by replacing the base at
the 932-position of aspartokinase gene lysC of the Corynebacterium glutamicum ATCC 13032 genome with cytosine
to thereby replace threonine at the 311-position by isoleucine, which indicates that this mutation is an effective mutation
contributing to the production of lysine.

[0139] Also, as another method to examine whether or not the identified mutation point is an effective mutation, there
is a method in which the mutation possessed by the lysine-producing strain is returned to the sequence of a wild type
strain by the gene replacement method and it is examined whether or not this has a negative influence on the lysine
productivity. For example, when the amino acid replacement mutation Val59Ala possessed by hom of the lysine-producing B-6 strain
was returned to a wild type amino acid sequence, the lysine productivity was lowered in comparison with the B-6 strain.
Thus, it was found that this mutation is an effective mutation contributing to the production of lysine.
[0140] Effective mutation points can be more efficiently and comprehensively extracted by combining, if needed, the 
DNA array analysis or proteome analysis described below. 

6. Method of breeding industrially advantageous production strain 

[0141] It has been a general practice to construct production strains, which are used industrially in the fermentation 
production of the target useful substances, such as amino acids, nucleic acids, vitamins, saccharides, organic acids, 
and the like, by repeating mutagenesis and breeding based on random mutagenesis using mutagens, such as NTG
or the like, and screening. 

[0142] In recent years, many examples of improved production strains have been made through the use of recombinant
DNA techniques. In breeding, however, most of the parent production strains to be improved are mutants obtained
by a conventional mutagenic procedure (W. Leuchtenberger, Amino Acids - Technical Production and Use. In:
Roehr (ed) Biotechnology, second edition, vol. 6, products of primary metabolism. VCH Verlagsgesellschaft mbH, Weinheim,
P 465 (1996)).

[0143] Although mutagenesis methods have largely contributed to the progress of the fermentation industry, they 
suffer from a serious problem of multiple, random introduction of mutations into every part of the chromosome. Since
many mutations are accumulated in a single chromosome each time a strain is improved, a production strain obtained
by random mutation and selection is generally inferior in properties (for example, showing poor growth, delayed
consumption of saccharides, and poor resistance to stresses such as temperature and oxygen) to a wild type strain,
which brings about troubles such as failing to establish a sufficiently elevated productivity, being frequently contaminated
with miscellaneous bacteria, requiring troublesome procedures in culture maintenance, and the like, and, in
turn, elevating the production cost in practice. In addition, the improvement in the productivity is based on random
mutations and thus the mechanism thereof is unclear. Therefore, it is very difficult to plan a rational breeding strategy 
for the subsequent improvement in the productivity. 

[0144] According to the present invention, effective mutation points contributing to the production can be efficiently 
specified from among many mutation points accumulated in the chromosome of a production strain which has been
bred from coryneform bacteria and, therefore, a novel breeding method of assembling these effective mutations in the 
coryneform bacteria can be established. Thus, a useful production strain can be reconstructed. It is also possible to 
construct a useful production strain from a wild type strain. 
[0145] Specifically, a useful mutant can be constructed in the following manner. 
[0146] One of the mutation points is incorporated into a wild type strain of coryneform bacteria. Then, it is examined
whether or not a positive effect is established on the production. When a positive effect is obtained, the mutation point 
is saved. When no effect is obtained, the mutation point is removed. Subsequently, only a strain having the effective 
mutation point is used as the parent strain, and the same procedure is repeated. In general, the effectiveness of a 
mutation positioned upstream cannot be clearly evaluated in some cases when there is a rate-determining point in the 
downstream of a biosynthesis pathway. It is therefore preferred to successively evaluate mutation points upward from
downstream. 
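
The procedure of paragraph [0146] amounts to a greedy, one-at-a-time accumulation of mutations, which can be sketched as follows. The callback `measure_productivity` stands in for a fermentation experiment and is purely hypothetical, as is the function name; the candidate list is assumed to be ordered from the downstream end of the pathway toward the upstream end.

```python
def accumulate_effective_mutations(candidates, measure_productivity):
    """Greedily build a strain from candidate mutation points.

    candidates: mutation labels, ordered downstream-first.
    measure_productivity: hypothetical callback taking the list of mutations
        carried by a strain and returning its measured productivity.
    Returns (kept_mutations, productivity_of_final_strain).
    """
    kept = []                                # start from the wild type strain
    best = measure_productivity(kept)
    for mutation in candidates:
        trial = kept + [mutation]            # introduce one more mutation point
        score = measure_productivity(trial)
        if score > best:                     # positive effect: save the mutation
            kept, best = trial, score
        # otherwise the mutation is discarded and the parent strain is unchanged
    return kept, best
```

Each iteration uses the strain carrying only the effective mutations found so far as the parent, so neutral or harmful mutation points are never carried forward.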

[0147] By reconstituting effective mutations by the method as described above in a wild type strain or a strain which
has a high growth speed or the same ability to consume saccharides as the wild type strain, it is possible to construct
an industrially advantageous strain which is free of troubles in the previous methods as described above and to conduct
fermentation production using such strains within a short time or at a higher temperature.

[0148] For example, a lysine-producing mutant B-6 (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)), which is obtained
by multiple rounds of random mutagenesis from a wild type strain Corynebacterium glutamicum ATCC 13032,
enables lysine fermentation to be performed at a temperature between 30 and 34°C but shows lowered growth and
lysine productivity at a temperature exceeding 34°C. Therefore, the fermentation temperature should be maintained
at 34°C or lower. In contrast thereto, the production strain described in the above item 5, which is obtained by reconstituting
effective mutations relating to lysine production, can achieve a productivity at 40 to 42°C equal or superior to
the result obtained by culturing at 30 to 34°C. Therefore, this strain is industrially advantageous since it can save the
load of cooling during the fermentation.

[0149] When culture should be carried out at a high temperature exceeding 43°C, a production strain capable of
conducting fermentation production at a high temperature exceeding 43°C can be obtained by reconstituting useful
mutations in a microorganism belonging to the genus Corynebacterium which can grow at high temperature exceeding
43°C. Examples of the microorganism capable of growing at a high temperature exceeding 43°C include Corynebacterium
thermoaminogenes, such as Corynebacterium thermoaminogenes FERM 9244, FERM 9245, FERM 9246 and
FERM 9247.

[0150] A strain having a further improved productivity of the target product can be obtained using the thus reconstructed
strain as the parent strain and further breeding it using the conventional mutagenesis method, the gene amplification
method, the gene replacement method using the recombinant DNA technique, the transduction method or
the cell fusion method. Accordingly, the microorganism of the present invention includes, but is not limited to, a mutant,
a cell fusion strain, a transformant, a transductant or a recombinant strain constructed by using recombinant DNA
techniques, so long as it is a producing strain obtained via the step of accumulating at least two effective mutations in
a coryneform bacterium in the course of breeding.

[0151] When a mutation point judged as being harmful to the growth or production is specified, on the other hand, 
it is examined whether or not the producing strain used at present contains the mutation point. When it has the mutation, 
it can be returned to the wild type gene and thus a further useful production strain can be bred. 
[0152] The breeding method as described above is applicable to microorganisms, other than coryneform bacteria,
which have industrially advantageous properties (for example, microorganisms capable of quickly utilizing less expen- 
sive carbon sources, microorganisms capable of growing at higher temperatures). 

7. Production and utilization of polynucleotide array 

(1) Production of polynucleotide array

[0153] A polynucleotide array can be produced using the polynucleotide or oligonucleotide of the present invention 
obtained in the above items 1 and 2. 
[0154] Examples include a polynucleotide array comprising a solid support to which at least one of a polynucleotide
comprising the nucleotide sequence represented by SEQ ID NOS:2 to 3501, a polynucleotide which hybridizes with
the polynucleotide under stringent conditions, and a polynucleotide comprising 10 to 200 continuous nucleotides in
the nucleotide sequence of the polynucleotide is adhered; and a polynucleotide array comprising a solid support to
which at least one of a polynucleotide encoding a polypeptide comprising the amino acid sequence represented by
any one of SEQ ID NOS:3502 to 7001, a polynucleotide which hybridizes with the polynucleotide under stringent
conditions, and a polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequences of the
polynucleotides is adhered.

[0155] Polynucleotide arrays of the present invention include substrates known in the art, such as a DNA chip, a
DNA microarray and a DNA macroarray, and the like, and comprise a solid support and plural polynucleotides or
fragments thereof which are adhered to the surface of the solid support.
[0156] Examples of the solid support include a glass plate, a nylon membrane, and the like. 
[0157] The polynucleotides or fragments thereof can be adhered to the
surface of the solid support using the general technique for preparing arrays, namely, a method in which they are
adhered to a chemically surface-treated solid support, for example, one to which a polycation such as polylysine or the like
has been adhered (Nat. Genet., 21: 15-19 (1999)). Chemically surface-treated supports are commercially available,
and the commercially available product can be used as the solid support of the polynucleotide array according
to the present invention.

[0158] As the polynucleotides or oligonucleotides adhered to the solid support, the polynucleotides and
oligonucleotides of the present invention obtained in the above items 1 and 2 can be used.

[0159] The analysis described below can be efficiently performed by adhering the polynucleotides or oligonucleotides
to the solid support at a high density, though a high fixation density is not always necessary.

[0160] Apparatus for achieving a high fixation density, such as an arrayer robot or the like, is commercially available
from Takara Shuzo (GMS417 Arrayer), and the commercially available product can be used.

[0161] Also, the oligonucleotides of the present invention can be synthesized directly on the solid support by the
photolithography method or the like (Nat. Genet., 21: 20-24 (1999)). In this method, a linker having a protective group
which can be removed by light irradiation is first adhered to a solid support, such as a slide glass or the like. Then, it
is irradiated with light through a mask (a photolithograph mask) permeating light exclusively at a definite part of the
adhesion part. Next, an oligonucleotide having a protective group which can be removed by light irradiation is added
to the part. Thus, a ligation reaction with the nucleotide arises exclusively at the irradiated part. By repeating this
procedure, oligonucleotides, each having a desired sequence, different from each other can be synthesized in respective
parts. Usually, the oligonucleotides to be synthesized have a length of 10 to 30 nucleotides.
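
The repeated mask-and-couple cycle can be simulated in a few lines. This is a simplified model under stated assumptions: one nucleotide is coupled per irradiation cycle, each mask is represented as the set of spots exposed to light, and the spot names and the function `synthesize` are invented for the example.

```python
def synthesize(targets):
    """Simulate light-directed synthesis of different oligonucleotides per spot.

    targets: dict mapping a spot name to the desired sequence at that spot.
    Returns (grown, cycles), where `grown` maps each spot to the sequence built
    and `cycles` lists (base, exposed_spots) for every irradiation cycle used.
    """
    grown = {spot: "" for spot in targets}
    cycles = []
    longest = max((len(seq) for seq in targets.values()), default=0)
    for position in range(longest):
        for base in "ACGT":
            # The mask exposes only spots whose next required base is `base`.
            mask = sorted(
                spot for spot, seq in targets.items()
                if len(seq) > position and seq[position] == base
            )
            if not mask:
                continue  # no spot needs this base at this position
            for spot in mask:
                grown[spot] += base  # coupling occurs only at irradiated spots
            cycles.append((base, mask))
    return grown, cycles
```

In this model at most four cycles are needed per nucleotide position, so oligonucleotides of 10 to 30 nucleotides require at most 40 to 120 cycles regardless of how many spots the array carries.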

(2) Use of polynucleotide array

[0162] The following procedures (a) and (b) can be carried out using the polynucleotide array prepared in the above 
(1). 

(a) Identification of mutation point of coryneform bacterium mutant and analysis of expression amount and expression
profile of gene encoded by genome 

[0163] By subjecting a gene derived from a mutant of coryneform bacteria or an examined gene to the following 
steps (i) to (iv), the mutation point of the gene can be identified or the expression amount and expression profile of the 
gene can be analyzed:

(i) producing a polynucleotide array by the method of the above (1);

(ii) incubating polynucleotides immobilized on the polynucleotide array together with the labeled gene derived from
a mutant of the coryneform bacterium using the polynucleotide array produced in the above (i) under hybridization
conditions;

(iii) detecting the hybridization; and 

(iv) analyzing the hybridization data. 

[0164] The gene derived from a mutant of coryneform bacteria or the examined gene includes a gene relating to
biosynthesis of at least one selected from amino acids, nucleic acids, vitamins, saccharides, organic acids, and analogues
thereof.

[0165] The method will be described in detail. 

[0166] A single nucleotide polymorphism (SNP) in a human region of 2,300 kb has been identified using polynucleotide
arrays (Science, 280: 1077-82 (1998)). In accordance with the method of identifying SNP and methods described
in Science, 278: 680-686 (1997); Proc. Natl. Acad. Sci. USA, 96: 12833-38 (1999); Science, 284: 1520-23 (1999), and
the like using the polynucleotide array produced in the above (1) and a nucleic acid molecule (DNA, RNA) derived from
coryneform bacteria in the method of the hybridization, a mutation point of a useful mutant, which is useful in producing
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, or the like can be identified and the gene
expression amount and the expression profile thereof can be analyzed.

[0167] The nucleic acid molecule (DNA, RNA) derived from the coryneform bacteria can be obtained according to
the general method described in Molecular Cloning, 2nd ed. or the like. mRNA derived from Corynebacterium glutamicum
can also be obtained by the method of Bormann et al. (Molecular Microbiology, 6: 317-326 (1992)) or the like.
[0168] Although ribosomal RNA (rRNA) is usually obtained in large excess in addition to the target mRNA, the analysis
is not seriously disturbed thereby.

[0169] The resulting nucleic acid molecule derived from coryneform bacteria is labeled. Labeling can be carried out 
according to a method using a fluorescent dye, a method using a radioisotope or the like. 

[0170] Specific examples include a labeling method in which psoralen-biotin is crosslinked with RNA extracted from
a microorganism and, after hybridization reaction, a fluorescent dye having streptavidin bound thereto is bound to
the biotin moiety (Nat. Biotechnol., 16: 45-48 (1998)); a labeling method in which a reverse transcription reaction is
carried out using RNA extracted from a microorganism as a template and random primers as primers, and dUTP having
a fluorescent dye (for example, Cy3, Cy5) (manufactured by Amersham Pharmacia Biotech) is incorporated into cDNA
(Proc. Natl. Acad. Sci. USA, 96: 12833-38 (1999)); and the like.
[0171] The labeling specificity can be improved by replacing the random primers by sequences complementary to
the 3'-end of ORF (J. Bacteriol., 181: 6425-40 (1999)).

[0172] In the hybridization method, the hybridization and subsequent washing can be carried out by the general
method (Nat. Biotechnol., 14: 1675-80 (1996), or the like).

[0173] Subsequently, the hybridization intensity is measured depending on the hybridization amount of the nucleic 
acid molecule used in the labeling. Thus, the mutation point can be identified and the expression amount of the gene
can be calculated. 

[0174] The hybridization intensity can be measured by visualizing the fluorescent signal, radioactivity, luminescence 
dose, and the like, using a laser confocal microscope, a CCD camera, a radiation imaging device (for example, STORM 
manufactured by Amersham Pharmacia Biotech), and the like, and then quantifying the thus visualized data. 
[0175] A polynucleotide array on a solid support can also be analyzed and quantified using a commercially available
apparatus, such as GMS418 Array Scanner (manufactured by Takara Shuzo) or the like.

[0176] The gene expression amount can be analyzed using commercially available software (for example, ImaGene
manufactured by Takara Shuzo; Array Gauge manufactured by Fuji Photo Film; ImageQuant manufactured by Amersham
Pharmacia Biotech, or the like).
[0177] A fluctuation in the expression amount of a specific gene can be monitored using a nucleic acid molecule
obtained in the time course of culture as the nucleic acid molecule derived from coryneform bacteria. The culture 
conditions can be optimized by analyzing the fluctuation. 

[0178] The expression profile of the microorganism at the total gene level (namely, which genes among a great
number of genes encoded by the genome have been expressed and the expression ratio thereof) can be determined
using a nucleic acid molecule having the sequences of many genes determined from the full genome sequence of the
microorganism. Thus, the expression amount of the genes determined by the full genome sequence can be analyzed
and, in turn, the biological conditions of the microorganism can be recognized as the expression pattern at the full
gene level.
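
As an illustration of how a fluctuation in expression amount over the time course of culture might be quantified from measured hybridization intensities, the following is a minimal sketch only. The function names, the pseudocount, the two-fold threshold and the intensity values are assumptions introduced for this example and are not part of the described method or of any software named above.

```python
import math


def log2_fold_changes(intensities, pseudocount=1.0):
    """Return log2(intensity_t / intensity_0) for every time point of every gene.

    `intensities` maps a gene name to a list of background-corrected
    hybridization intensities, one per sampling time. The pseudocount
    (an assumption of this sketch) avoids taking the logarithm of zero.
    """
    changes = {}
    for gene, series in intensities.items():
        baseline = series[0] + pseudocount
        changes[gene] = [math.log2((value + pseudocount) / baseline) for value in series]
    return changes


def fluctuating_genes(changes, threshold=1.0):
    """List genes whose expression changes at least 2**threshold-fold at any time."""
    return sorted(gene for gene, series in changes.items()
                  if any(abs(x) >= threshold for x in series))


# Hypothetical intensities at three sampling times (made-up numbers).
example = {"orfA": [99.0, 199.0, 399.0], "orfB": [49.0, 54.0, 47.0]}
changes = log2_fold_changes(example)
print(fluctuating_genes(changes))
```

In this made-up data set only orfA doubles relative to the first sampling time, so it is the only gene reported as fluctuating.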

(b) Confirmation of the presence of gene homologous to examined gene in coryneform bacteria

[0179] Whether or not a gene homologous to the examined gene, which is present in an organism other than co- 
ryneform bacteria, is present in coryneform bacteria can be detected using the polynucleotide array prepared in the 
above (1). 

[0180] This detection can be carried out by a method in which an examined gene which is present in an organism
other than coryneform bacteria is used instead of the nucleic acid molecule derived from coryneform bacteria used in
the above identification/analysis method of (1).

8. Recording medium storing full genome nucleotide sequence and ORF data and being readable by a computer and
methods for using the same

[0181] The term "recording medium or storage device which is readable by a computer" means a recording medium
or storage medium which can be directly read out and accessed with a computer. Examples include magnetic recording
media, such as a floppy disk, a hard disk, a magnetic tape, and the like; optical recording media, such as CD-ROM,
CD-R, CD-RW, DVD-ROM, DVD-RAM, DVD-RW, and the like; electric recording media, such as RAM, ROM, and the
like; and hybrids in these categories (for example, magnetic/optical recording media, such as MO and the like).
[0182] Instruments for recording or inputting in or on the recording medium or instruments or devices for reading out
the information in the recording medium can be appropriately selected, depending on the type of the recording medium
and the access device utilized. Also, various data processing programs, software, comparators and formats are used
for recording and utilizing the polynucleotide sequence information or the like of the present invention in the recording
medium. The information can be expressed in the form of a binary file, a text file or an ASCII file formatted with
commercially available software, for example. Moreover, software for accessing the sequence information is available and
known to one of ordinary skill in the art.

[0183] Examples of the information to be recorded in the above-described medium include the full genome nucleotide
sequence information of coryneform bacteria as obtained in the above item 2, the nucleotide sequence information of
ORF, the amino acid sequence information encoded by the ORF, and the functional information of polynucleotides
coding for the amino acid sequences.

[0184] The recording medium or storage device which is readable by a computer according to the present invention
refers to a medium in which the information of the present invention has been recorded. Examples include recording
media or storage devices which are readable by a computer storing the nucleotide sequence information represented
by SEQ ID NOS:1 to 3501, the amino acid sequence information represented by SEQ ID NOS:3502 to 7001, the
functional information of the nucleotide sequences represented by SEQ ID NOS:1 to 3501, the functional information
of the amino acid sequences represented by SEQ ID NOS:3502 to 7001, and the information listed in Table 1 below
and the like.
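
One way the kinds of information listed above could be laid out in an ASCII text file is sketched below. This is a hedged illustration only: the record layout, the field names, the JSON format and the example values are assumptions made for this sketch, and no particular file format is prescribed by the description.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class OrfRecord:
    """One ORF entry; the field names are hypothetical, chosen for this sketch."""
    seq_id_no: int    # SEQ ID NO of the nucleotide sequence
    nucleotide: str   # nucleotide sequence of the ORF
    amino_acid: str   # amino acid sequence encoded by the ORF
    function: str     # functional information


def save_records(records, path):
    """Write the records as an ASCII text file (JSON escapes non-ASCII by default)."""
    with open(path, "w", encoding="ascii") as handle:
        json.dump([asdict(record) for record in records], handle, indent=2)


def load_records(path):
    """Read the records back from the text file."""
    with open(path, encoding="ascii") as handle:
        return [OrfRecord(**item) for item in json.load(handle)]


# Made-up placeholder record, not an actual sequence of the sequence listing.
EXAMPLE = OrfRecord(seq_id_no=2, nucleotide="ATGAAATAA", amino_acid="MK",
                    function="placeholder annotation")
```

Calling save_records([EXAMPLE], "records.json") and then load_records("records.json") returns an equal list of records, so the file can serve as input to the searching and analyzing software described below.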

9. System based on a computer using the recording medium of the present invention which is readable by a computer

[0185] The term "system based on a computer" as used herein refers to a system composed of hardware device(s),
software device(s), and data recording device(s) which are used for analyzing the data recorded in the recording
medium of the present invention which is readable by a computer.

[0186] The hardware device(s) are, for example, composed of an input unit, a data recording unit, a central processing
unit and an output unit collectively or individually.

[0187] By the software device(s), the data recorded in the recording medium of the present invention are searched
or analyzed using the recorded data and the hardware device(s) as described herein. Specifically, the software
device(s) contain at least one program which acts on or with the system in order to screen, analyze or compare biologically
meaningful structures or information from the nucleotide sequences, amino acid sequences and the like recorded in
the recording medium according to the present invention.

[0188] Examples of the software device(s) for identifying ORF and EMF domains include GeneMark (Nuc. Acids.
Res., 22: 4756-67 (1994)), GeneHacker (Protein, Nucleic Acid and Enzyme, 42: 3001-07 (1997)), Glimmer (The
Institute of Genomic Research; Nuc. Acids. Res., 26: 544-548 (1998)) and the like. In the process of using such a software
device, the default (initial setting) parameters are usually used, although the parameters can be changed, if necessary,
in a manner known to one of ordinary skill in the art.
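
To make the notion of ORF identification concrete, the following is a deliberately simple sketch that scans the three forward reading frames for an ATG codon followed in frame by a stop codon. It is not the algorithm used by GeneMark, GeneHacker or Glimmer, which rely on statistical models; the function name, the minimum length parameter and the test sequence are assumptions made for this example, and alternative start codons and the reverse strand are ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    Coordinates are 0-based and end-exclusive, and the stop codon is included.
    Only ORFs of at least `min_codons` codons (stop included) are reported.
    """
    sequence = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(sequence) - 2, 3):
            codon = sequence[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos                      # first ATG opens a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos + 3 - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3, frame))
                start = None                     # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATTTTAGGG"))  # prints [(2, 14, 2)]
```

The single ORF found in the made-up test sequence is ATG AAA TTT TAG in the third reading frame.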

[0189] Examples of the software device(s) for identifying a genome domain or a polypeptide domain analogous to
the target sequence or the target structural motif (homology searching) include FASTA, BLAST, Smith-Waterman,
GenetyxMac (manufactured by Software Development), GCG Package (manufactured by Genetic Computer Group),
GenCore (manufactured by Compugen), and the like. In the process of using such a software device, the default (initial
setting) parameters are usually used, although the parameters can be changed, if necessary, in a manner known to
one of ordinary skill in the art.
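
For orientation, the local alignment score computed by the Smith-Waterman method can be sketched in a few lines. This is a minimal illustration with a linear gap penalty; the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for this example and are not the default parameters of any of the packages named above.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Uses the Smith-Waterman recurrence with a linear gap penalty and keeps
    only two rows of the dynamic programming matrix in memory.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diagonal = previous[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A cell never drops below zero, which is what makes the alignment local.
            current[j] = max(0, diagonal, previous[j] + gap, current[j - 1] + gap)
            best = max(best, current[j])
        previous = current
    return best


print(smith_waterman_score("TTACGTT", "GGACGGG"))  # prints 6 (shared segment ACG)
```

A higher score indicates a longer or better conserved shared segment; practical homology searching additionally uses substitution matrices and significance statistics that are omitted here.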

[0190] Such a recording medium storing the full genome sequence data is useful in preparing a polynucleotide array 
by which the expression amount of a gene encoded by the genome DNA of coryneform bacteria and the expression 
profile at the total gene level of the microorganism, namely, which genes among many genes encoded by the genome 
have been expressed and the expression ratio thereof, can be determined. 

[0191] The data recording device(s) provided by the present invention are, for example, memory device(s) for
recording the data recorded in the recording medium of the present invention and target sequence or target structural
motif data, or the like, and memory accessing device(s) for accessing the same.

[0192] Namely, the system based on a computer according to the present invention comprises the following: 

(i) a user input device that inputs the information stored in the recording medium of the present invention, and
target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the information stored in the recording medium of the present invention with the
target sequence or target structure motif information, recorded by the data storing device of (ii) for screening and
analyzing nucleotide sequence information which is coincident with or analogous to the target sequence or target
structure motif information; and

(iv) an output device that shows a screening or analyzing result obtained by the comparator.

[0193] This system is usable in the methods in items 2 to 5 as described above for searching and analyzing the ORF
and EMF domains, target sequence, target structural motif, etc. of a coryneform bacterium, searching homologs,
searching and analyzing isozymes, determining the biosynthesis pathway and the signal transmission pathway, and
identifying spots which have been found in the proteome analysis. The term "homologs" as used herein includes both
orthologs and paralogs.
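
The four components (i) to (iv) listed above can be pictured with the following minimal sketch. The class name, the method names, the use of a regular expression as the comparator and the example sequences are all assumptions made for illustration; an actual system would typically use a homology search program as the comparator.

```python
import re
from dataclasses import dataclass, field


@dataclass
class SequenceSearchSystem:
    """Toy model of the input, storage, comparator and output components."""
    stored: dict = field(default_factory=dict)    # (ii) data storage device

    def input_sequences(self, records):           # (i) input device
        """Load sequence records given as a mapping of name to sequence."""
        self.stored.update(records)

    def compare(self, target_motif):              # (iii) comparator
        """Return (name, position, matched text) for every motif occurrence."""
        pattern = re.compile(target_motif)
        hits = []
        for name, sequence in self.stored.items():
            for found in pattern.finditer(sequence):
                hits.append((name, found.start(), found.group()))
        return hits

    def output(self, hits):                       # (iv) output device
        """Format the hits as tab-separated text, one hit per line."""
        return "\n".join(f"{name}\t{start}\t{text}" for name, start, text in hits)


system = SequenceSearchSystem()
system.input_sequences({"orf1": "ATGGCTAGCAAATAA", "orf2": "ATGCCCTAA"})  # made-up data
hits = system.compare("GCTAGC")
print(system.output(hits))
```

Here the target motif GCTAGC is found once, at position 3 of the made-up sequence orf1.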

10. Production of polypeptide using ORF derived from coryneform bacteria 

[0194] The polypeptide of the present invention can be produced using a polynucleotide comprising the ORF obtained
in the above item 2. Specifically, the polypeptide of the present invention can be produced by expressing the
polynucleotide of the present invention or a fragment thereof in a host cell, using the method described in Molecular Cloning,
2nd ed., Current Protocols in Molecular Biology, and the like, for example, according to the following method.
[0195] A DNA fragment having a suitable length containing a part encoding the polypeptide is prepared from the full
length ORF sequence, if necessary.
[0196] Also, if necessary, a DNA is prepared in which nucleotides in the nucleotide sequence of the part encoding the
polypeptide of the present invention are replaced to give codons suitable for expression in the host cell. The DNA is
useful for efficiently producing the polypeptide of the present invention.
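
The codon replacement mentioned above can be sketched as a synonymous substitution that leaves the encoded amino acid sequence unchanged. The sketch below uses the standard genetic code; the table of preferred codons and the test sequence are hypothetical placeholders chosen for this example and do not represent measured codon usage of any host.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): amino_acid
               for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - len(dna) % 3, 3))


def recode(dna, preferred):
    """Replace each codon by the preferred synonymous codon, where one is given."""
    dna = dna.upper()
    codons = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        codon = dna[i:i + 3]
        codons.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(codons)


# Hypothetical preferences (amino acid -> codon), for illustration only.
PREFERRED = {"L": "CTG", "R": "CGT", "K": "AAA"}
original = "ATGTTAAGAAAGTAA"               # encodes M L R K followed by a stop
optimized = recode(original, PREFERRED)
print(optimized, translate(optimized))     # prints ATGCTGCGTAAATAA MLRK*
```

Because only synonymous codons are exchanged, translate(original) and translate(optimized) give the same amino acid sequence.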

[0197] A recombinant vector is prepared by inserting the DNA fragment downstream of a promoter in a
suitable expression vector.
[0198] The recombinant vector is introduced into a host cell suitable for the expression vector.

[0199] Any of bacteria, yeasts, animal cells, insect cells, plant cells, and the like can be used as the host cell so long
as it can express the gene of interest.

[0200] Examples of the expression vector include those which can replicate autonomously in the above-described
host cell or can be integrated into a chromosome and have a promoter at such a position that the DNA encoding the
polypeptide of the present invention can be transcribed.

[0201] When a prokaryotic cell, such as a bacterium or the like, is used as the host cell, it is preferred that the
recombinant vector containing the DNA encoding the polypeptide of the present invention can replicate autonomously
in the bacterium and is a recombinant vector constituted by at least a promoter, a ribosome binding sequence, the
DNA of the present invention and a transcription termination sequence. A promoter controlling gene can also be
contained therewith in operable combination.

[0202] Examples of the expression vectors include a vector plasmid which is replicable in Corynebacterium glutamicum,
such as pCG1 (Japanese Published Unexamined Patent Application No. 134500/82), pCG2 (Japanese Published
Unexamined Patent Application No. 36197/83), pCG4 (Japanese Published Unexamined Patent Application No.
183799/82), pCG11 (Japanese Published Unexamined Patent Application No. 134500/82), pCG116, pCE54 and
pCB101 (Japanese Published Unexamined Patent Application No. 105999/83), pCE51, pCE52 and pCE53 (Mol. Gen.
Genet., 196: 175-178 (1984)), and the like; a vector plasmid which is replicable in Escherichia coli, such as pET3 and
pET11 (manufactured by Stratagene), pBAD, pThioHis and pTrcHis (manufactured by Invitrogen), pKK223-3 and
pGEX2T (manufactured by Amersham Pharmacia Biotech), and the like; and pBTrp2, pBTac1 and pBTac2 (manufactured
by Boehringer Mannheim Co.), pSE280 (manufactured by Invitrogen), pGEMEX-1 (manufactured by Promega),
pQE-8 (manufactured by QIAGEN), pKYP10 (Japanese Published Unexamined Patent Application No. 110600/83),
pKYP200 (Agric. Biol. Chem., 48: 669 (1984)), pLSA1 (Agric. Biol. Chem., 53: 277 (1989)), pGEL1 (Proc. Natl. Acad.
Sci. USA, 82: 4306 (1985)), pBluescript II SK(-) (manufactured by Stratagene), pTrs30 (prepared from Escherichia coli
JM109/pTrS30 (FERM BP-5407)), pTrs32 (prepared from Escherichia coli JM109/pTrS32 (FERM BP-5408)), pGHA2
(prepared from Escherichia coli IGHA2 (FERM B-400), Japanese Published Unexamined Patent Application No.
221091/85), pGKA2 (prepared from Escherichia coli IGKA2 (FERM BP-6798), Japanese Published Unexamined Patent
Application No. 221091/85), pTerm2 (U.S. Patents 4,686,191, 4,939,094 and 5,160,735), pSupex, pUB110, pTP5,
pC194 and pEG400 (J. Bacteriol., 172: 2392 (1990)), pGEX (manufactured by Pharmacia), pET system (manufactured
by Novagen), and the like.

[0203] Any promoter can be used so long as it can function in the host cell. Examples include promoters derived
from Escherichia coli, phage and the like, such as trp promoter (Ptrp), lac promoter, PL promoter, PR promoter, T7
promoter and the like. Also, artificially designed and modified promoters, such as a promoter in which two Ptrp are
linked in series (Ptrp×2), tac promoter, lacT7 promoter, letI promoter and the like, can be used.

[0204] It is preferred to use a plasmid in which the space between the Shine-Dalgarno sequence, which is the ribosome
binding sequence, and the initiation codon is adjusted to an appropriate distance (for example, 6 to 18 nucleotides).
[0205] The transcription termination sequence is not always necessary for the expression of the DNA of the present
invention. However, it is preferred to arrange the transcription termination sequence just downstream of the structural
gene.
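
The spacing criterion between the Shine-Dalgarno sequence and the initiation codon can be checked on a candidate construct as sketched below. This is an illustrative sketch only: the motif AGGAGG is assumed here as a stand-in for the Shine-Dalgarno sequence and ATG as the initiation codon, and the function names and test sequences are made up for this example. The 6 to 18 nucleotide range is the one given in the text above.

```python
def sd_spacing(sequence, sd_motif="AGGAGG", start_codon="ATG"):
    """Return the number of nucleotides between the end of the SD motif and the
    first start codon downstream of it, or None if either element is missing."""
    sequence = sequence.upper()
    sd_pos = sequence.find(sd_motif)
    if sd_pos < 0:
        return None
    sd_end = sd_pos + len(sd_motif)
    start_pos = sequence.find(start_codon, sd_end)
    if start_pos < 0:
        return None
    return start_pos - sd_end


def spacing_is_suitable(sequence, low=6, high=18):
    """True when the SD-to-initiation-codon spacing lies within low..high nucleotides."""
    spacing = sd_spacing(sequence)
    return spacing is not None and low <= spacing <= high


print(sd_spacing("TTAGGAGGTTTAAACCATGGCT"))      # prints 8
print(spacing_is_suitable("AGGAGGATGAAA"))        # prints False (spacing is 0)
```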

[0206] One of ordinary skill in the art will appreciate that the codons of the above-described elements may be
optimized, in a known manner, depending on the host cells and environmental conditions utilized.
[0207] Examples of the host cell include microorganisms belonging to the genus Escherichia, the genus Serratia,
the genus Bacillus, the genus Brevibacterium, the genus Corynebacterium, the genus Microbacterium, the genus
Pseudomonas, and the like. Specific examples include Escherichia coli XL1-Blue, Escherichia coli XL2-Blue, Escherichia
coli DH1, Escherichia coli MC1000, Escherichia coli KY3276, Escherichia coli W1485, Escherichia coli JM109,
Escherichia coli HB101, Escherichia coli No. 49, Escherichia coli W3110, Escherichia coli NY49, Escherichia coli GI698,
Escherichia coli TB1, Serratia ficaria, Serratia fonticola, Serratia liquefaciens, Serratia marcescens, Bacillus subtilis,
Bacillus amyloliquefaciens, Corynebacterium ammoniagenes, Brevibacterium immariophilum ATCC 14068,
Brevibacterium saccharolyticum ATCC 14066, Corynebacterium glutamicum ATCC 13032, Corynebacterium glutamicum ATCC
13869, Corynebacterium glutamicum ATCC 14067 (prior genus and species: Brevibacterium flavum), Corynebacterium
glutamicum ATCC 13869 (prior genus and species: Brevibacterium lactofermentum, or Corynebacterium
lactofermentum), Corynebacterium acetoacidophilum ATCC 13870, Corynebacterium thermoaminogenes FERM 9244,
Microbacterium ammoniaphilum ATCC 15354, Pseudomonas putida, Pseudomonas sp. D-0110, and the like.
[0208] When Corynebacterium glutamicum or an analogous microorganism is used as a host, an EMF necessary
for expressing the polypeptide is not always contained in the vector so long as the polynucleotide of the present
invention contains an EMF. When the EMF is not contained in the polynucleotide, it is necessary to prepare the EMF
separately and ligate it so as to be in operable combination. Also, when a higher expression amount or specific
expression regulation is necessary, it is necessary to ligate the EMF corresponding thereto so as to put the EMF in
operable combination with the polynucleotide. Examples of using an externally ligated EMF are disclosed in
Microbiology, 142: 1297-1309 (1996).

[0209] With regard to the method for the introduction of the recombinant vector, any method for introducing DNA into
the above-described host cells, such as a method in which a calcium ion is used (Proc. Natl. Acad. Sci. USA, 69: 2110
(1972)), a protoplast method (Japanese Published Unexamined Patent Application No. 2483942/88), the methods
described in Gene, 17: 107 (1982) and Molecular & General Genetics, 168: 111 (1979) and the like, can be used.

[0210] When yeast is used as the host cell, examples of the expression vector include pYES2 (manufactured by
Invitrogen), YEp13 (ATCC 37115), YEp24 (ATCC 37051), YCp50 (ATCC 37419), pHS19, pHS15, and the like.
[0211] Any promoter can be used so long as it can be expressed in yeast. Examples include a promoter of a gene
in the glycolytic pathway, such as hexose kinase and the like, PHO5 promoter, PGK promoter, GAP promoter, ADH
promoter, gal 1 promoter, gal 10 promoter, a heat shock protein promoter, MFα1 promoter, CUP 1 promoter, and the like.

[0212] Examples of the host cell include microorganisms belonging to the genus Saccharomyces, the genus
Schizosaccharomyces, the genus Kluyveromyces, the genus Trichosporon, the genus Schwanniomyces, the genus
Pichia, the genus Candida and the like. Specific examples include Saccharomyces cerevisiae, Schizosaccharomyces
pombe, Kluyveromyces lactis, Trichosporon pullulans, Schwanniomyces alluvius, Candida utilis and the like.
[0213] With regard to the method for the introduction of the recombinant vector, any method for introducing DNA into
yeast, such as an electroporation method (Methods Enzymol., 194: 182 (1990)), a spheroplast method (Proc. Natl.
Acad. Sci. USA, 75: 1929 (1978)), a lithium acetate method (J. Bacteriol., 153: 163 (1983)), a method described in
Proc. Natl. Acad. Sci. USA, 75: 1929 (1978) and the like, can be used.

[0214] When animal cells are used as the host cells, examples of the expression vector include pcDNA3.1, pSinRep5
and pCEP4 (manufactured by Invitrogen), pRev-Tre (manufactured by Clontech), pAxCAwt (manufactured by Takara
Shuzo), pcDNAI and pcDM8 (manufactured by Funakoshi), pAGE107 (Japanese Published Unexamined Patent
Application No. 22979/91; Cytotechnology, 3: 133 (1990)), pAS3-3 (Japanese Published Unexamined Patent Application
No. 227075/90), pcDM8 (Nature, 329: 840 (1987)), pcDNAI/Amp (manufactured by Invitrogen), pREP4 (manufactured
by Invitrogen), pAGE103 (J. Biochem., 101: 1307 (1987)), pAGE210, and the like.

[0215] Any promoter can be used so long as it can function in animal cells. Examples include a promoter of IE
(immediate early) gene of cytomegalovirus (CMV), an early promoter of SV40, a promoter of retrovirus, a
metallothionein promoter, a heat shock promoter, SRα promoter, and the like. Also, the enhancer of the IE gene of human
CMV can be used together with the promoter.

[0216] Examples of the host cell include human Namalwa cell, monkey COS cell, Chinese hamster CHO cell,
HST5637 (Japanese Published Unexamined Patent Application No. 299/88), and the like.
[0217] The method for introduction of the recombinant vector into animal cells is not particularly limited, so long as
it is the general method for introducing DNA into animal cells, such as an electroporation method (Cytotechnology, 3:
133 (1990)), a calcium phosphate method (Japanese Published Unexamined Patent Application No. 227075/90), a
lipofection method (Proc. Natl. Acad. Sci. USA, 84: 7413 (1987)), the method described in Virology, 52: 456 (1973),
and the like.

[0218] When insect cells are used as the host cells, the polypeptide can be expressed, for example, by the method
described in Baculovirus Expression Vectors, A Laboratory Manual, W.H. Freeman and Company, New York (1992),
Bio/Technology, 6: 47 (1988), or the like.

[0219] Specifically, a recombinant gene transfer vector and baculovirus are simultaneously introduced into insect cells
to obtain a recombinant virus in an insect cell culture supernatant, and then the insect cells are infected with the resulting
recombinant virus to express the polypeptide.

[0220] Examples of the gene introducing vector used in the method include pBlueBac4.5, pVL1392, pVL1393 and
pBlueBacIII (manufactured by Invitrogen), and the like.
[0221] Examples of the baculovirus include Autographa californica nuclear polyhedrosis virus with which insects of
the family Barathra are infected, and the like.

[0222] Examples of the insect cells include Spodoptera frugiperda oocytes Sf9 and Sf21 (Baculovirus Expression
Vectors, A Laboratory Manual, W.H. Freeman and Company, New York (1992)), Trichoplusia ni oocyte High 5
(manufactured by Invitrogen) and the like.
[0223] Examples of the method for simultaneously incorporating the above-described recombinant gene transfer vector
and the above-described baculovirus for the preparation of the recombinant virus include the calcium phosphate method
(Japanese Published Unexamined Patent Application No. 227075/90), the lipofection method (Proc. Natl. Acad. Sci. USA,
84: 7413 (1987)) and the like.

[0224] When plant cells are used as the host cells, examples of the expression vector include a Ti plasmid, a tobacco
mosaic virus vector, and the like.

[0225] Any promoter can be used so long as it can be expressed in plant cells. Examples include 35S promoter of 
cauliflower mosaic virus (CaMV), rice actin 1 promoter, and the like. 

[0226] Examples of the host cells include plant cells and the like, such as tobacco, potato, tomato, carrot, soybean, 
rape, alfalfa, rice, wheat, barley, and the like. 

[0227] The method for introducing the recombinant vector is not particularly limited, so long as it is the general method
for introducing DNA into plant cells, such as the Agrobacterium method (Japanese Published Unexamined Patent
Application No. 140885/84, Japanese Published Unexamined Patent Application No. 70080/85, WO 94/00977), the
electroporation method (Japanese Published Unexamined Patent Application No. 251887/85), the particle gun method
(Japanese Patents 2606856 and 2517813), and the like.

[0228] The transformant of the present invention includes a transformant containing the polypeptide of the present
invention per se rather than as a recombinant vector, that is, a transformant containing the polypeptide of the present
invention which is integrated into a chromosome of the host, in addition to the transformant containing the above
recombinant vector.

[0229] When expressed in yeasts, animal cells, insect cells or plant cells, a glycopolypeptide or glycosylated
polypeptide can be obtained.

[0230] The polypeptide can be produced by culturing the thus obtained transformant of the present invention in a 
culture medium to produce and accumulate the polypeptide of the present invention or any polypeptide expressed 
under the control of an EMF of the present invention, and recovering the polypeptide from the culture. 
[0231] Culturing of the transformant of the present invention in a culture medium is carried out according to the
conventional method as used in culturing of the host.

[0232] When the transformant of the present invention is obtained using a prokaryote, such as Escherichia coli or
the like, or a eukaryote, such as yeast or the like, as the host, the transformant is cultured as follows.

[0233] Any of a natural medium and a synthetic medium can be used, so long as it contains a carbon source, a
nitrogen source, an inorganic salt and the like which can be assimilated by the transformant and can perform culturing
of the transformant efficiently.

[0234] Examples of the carbon source include those which can be assimilated by the transformant, such as
carbohydrates (for example, glucose, fructose, sucrose, molasses containing them, starch, starch hydrolysate, and the like),
organic acids (for example, acetic acid, propionic acid, and the like), and alcohols (for example, ethanol, propanol, and
the like).

[0235] Examples of the nitrogen source include ammonia, various ammonium salts of inorganic acids or organic
acids (for example, ammonium chloride, ammonium sulfate, ammonium acetate, ammonium phosphate, and the like),
other nitrogen-containing compounds, peptone, meat extract, yeast extract, corn steep liquor, casein hydrolysate,
soybean meal and soybean meal hydrolysate, various fermented cells and hydrolysates thereof, and the like.
[0236] Examples of the inorganic salt include potassium dihydrogen phosphate, dipotassium hydrogen phosphate,
magnesium phosphate, magnesium sulfate, sodium chloride, ferrous sulfate, manganese sulfate, copper sulfate, calcium
carbonate, and the like.

[0237] The culturing is carried out under aerobic conditions by shaking culture, submerged-aeration stirring culture
or the like. The culturing temperature is preferably from 15 to 40°C, and the culturing time is generally from 16 hours
to 7 days. The pH of the medium is preferably maintained at 3.0 to 9.0 during the culturing. The pH can be adjusted
using an inorganic or organic acid, an alkali solution, urea, calcium carbonate, ammonia, or the like.

[0238] Also, antibiotics, such as ampicillin, tetracycline, and the like, can be added to the medium during the culturing, 
if necessary. 

[0239] When a microorganism transformed with a recombinant vector containing an inducible promoter is cultured,
an inducer can be added to the medium, if necessary.

[0240] For example, isopropyl-β-D-thiogalactopyranoside (IPTG) or the like can be added to the medium when a
microorganism transformed with a recombinant vector containing lac promoter is cultured, or indoleacrylic acid (IAA)
or the like can be added thereto when a microorganism transformed with an expression vector containing trp promoter
is cultured.

[0241] Examples of the medium used in culturing a transformant obtained using animal cells as the host cells include
RPMI 1640 medium (The Journal of the American Medical Association, 199: 519 (1967)), Eagle's MEM medium
(Science, 122: 501 (1952)), Dulbecco's modified MEM medium (Virology, 8: 396 (1959)), 199 Medium (Proceeding of the
Society for the Biological Medicine, 73: 1 (1950)), the above-described media to which fetal calf serum has been added,
and the like.

[0242] The culturing is carried out generally at a pH of 6 to 8 and a temperature of 30 to 40°C in the presence of 5% 
CO2 for 1 to 7 days. 

[0243] Also, if necessary, antibiotics, such as kanamycin, penicillin, and the like, can be added to the medium during 
the culturing. 

[0244] Examples of the medium used in culturing a transformant obtained using insect cells as the host cells include
TNM-FH medium (manufactured by Pharmingen), Sf-900 II SFM (manufactured by Life Technologies), ExCell 400 and
ExCell 405 (manufactured by JRH Biosciences), Grace's Insect Medium (Nature, 195: 788 (1962)), and the like.
[0245] The culturing is carried out generally at a pH of 6 to 7 and a temperature of 25 to 30°C for 1 to 5 days.
[0246] Additionally, antibiotics, such as gentamicin and the like, can be added to the medium during the culturing, if
necessary.

[0247] A transformant obtained by using a plant cell as the host cell can be used as the cell or after differentiating
to a plant cell or organ. Examples of the medium used in the culturing of the transformant include Murashige and Skoog
(MS) medium, White medium, media to which a plant hormone, such as auxin, cytokinin, or the like has been added,
and the like.

[0248] The culturing is carried out generally at a pH of 5 to 9 and a temperature of 20 to 40°C for 3 to 60 days.

[0249] Also, antibiotics, such as kanamycin, hygromycin and the like, can be added to the medium during the cul- 
turing, if necessary. 

[0250] As described above, the polypeptide can be produced by culturing a transformant derived from a
microorganism, animal cell or plant cell containing a recombinant vector into which a DNA encoding the polypeptide of the
present invention has been inserted according to the general culturing method to produce and accumulate the
polypeptide, and recovering the polypeptide from the culture.

[0251] In addition to direct expression, the process of gene expression may include secretory production of the
encoded protein, fusion protein expression and the like in accordance with the methods described in Molecular Cloning,
2nd ed.

[0252] The method for producing the polypeptide of the present invention includes a method of intracellular
expression in a host cell, a method of extracellular secretion from a host cell, or a method of production on a host cell membrane
outer envelope. The method can be selected by changing the host cell employed or the structure of the polypeptide
produced.

[0253] When the polypeptide of the present invention is produced in a host cell or on a host cell membrane outer
envelope, the polypeptide can be actively secreted extracellularly according to, for example, the method of Paulson
et al. (J. Biol. Chem., 264: 17619 (1989)), the method of Lowe et al. (Proc. Natl. Acad. Sci. USA, 86: 8227 (1989);
Genes Develop., 4: 1288 (1990)), and/or the methods described in Japanese Published Unexamined Patent Application
No. 336963/93, WO 94/23021, and the like.

[0254] Specifically, the polypeptide of the present invention can be actively secreted extracellularly by expressing
it, according to the recombinant DNA technique, in a form in which a signal peptide has been added in front of a
polypeptide containing an active site of the polypeptide of the present invention.

[0255] Furthermore, the amount produced can be increased using a gene amplification system, such as by use of
a dihydrofolate reductase gene or the like according to the method described in Japanese Published Unexamined
Patent Application No. 227075/90.
[0256] Moreover, the polypeptide of the present invention can be produced by a transgenic animal individual
(transgenic nonhuman animal) or plant individual (transgenic plant).

[0257] When the transformant is the animal individual or plant individual, the polypeptide of the present invention
can be produced by breeding or cultivating it so as to produce and accumulate the polypeptide, and recovering the
polypeptide from the animal individual or plant individual.
[0258] Examples of the method for producing the polypeptide of the present invention using the animal individual
include a method for producing the polypeptide of the present invention in an animal developed by inserting a gene
according to methods known to those of ordinary skill in the art (American Journal of Clinical Nutrition, 63: 639S (1996),
American Journal of Clinical Nutrition, 63: 627S (1996), Bio/Technology, 9: 830 (1991)).
[0259] In the animal individual, the polypeptide can be produced by breeding a transgenic nonhuman animal into which
the DNA encoding the polypeptide of the present invention has been inserted to produce and accumulate the
polypeptide in the animal, and recovering the polypeptide from the animal. Examples of the production and accumulation place
in the animal include milk (Japanese Published Unexamined Patent Application No. 309192/88), egg and the like of
the animal. Any promoter can be used, so long as it can be expressed in the animal. Suitable examples include an
α-casein promoter, a β-casein promoter, a β-lactoglobulin promoter, a whey acidic protein promoter, and the like, which
are specific for mammary glandular cells.

[0260] Examples of the method for producing the polypeptide of the present invention using the plant individual
include a method for producing the polypeptide of the present invention by cultivating a transgenic plant into which the
DNA encoding the protein of the present invention has been inserted by a known method (Tissue Culture, 20 (1994),
Tissue Culture, 21 (1994), Trends in Biotechnology, 15: 45 (1997)) to produce and accumulate the polypeptide in the
plant, and recovering the polypeptide from the plant.

[0261] The polypeptide according to the present invention can also be obtained by translation in vitro. 

[0262] The polypeptide of the present invention can be produced by a translation system in vitro. There are, for
example, two in vitro translation methods which may be used, namely, a method using RNA as a template and another
method using DNA as a template. The template RNA includes the whole RNA, mRNA, an in vitro transcription product,
and the like. The template DNA includes a plasmid containing a transcriptional promoter and a target gene integrated
therein and downstream of the initiation site, a PCR/RT-PCR product and the like. To select the most suitable system
for the in vitro translation, the origin of the gene encoding the protein to be synthesized (prokaryotic cell/eucaryotic
cell), the type of the template (DNA/RNA), the purpose of using the synthesized protein and the like should be considered.
In vitro translation kits having various characteristics are commercially available from many companies (Boehringer
Mannheim, Promega, Stratagene, or the like), and every kit can be used in producing the polypeptide according
to the present invention.

[0263] Transcription/translation of a DNA nucleotide sequence cloned into a plasmid containing a T7 promoter can
be carried out using an in vitro transcription/translation system E. coli T7 S30 Extract System for Circular DNA
(manufactured by Promega, catalogue No. L1130). Also, transcription/translation using, as a template, a linear prokaryotic
DNA of a supercoil non-sensitive promoter, such as lacUV5, tac, λPL(con), λPL, or the like, can be carried out using
an in vitro transcription/translation system E. coli S30 Extract System for Linear Templates (manufactured by Promega,
catalogue No. L1030). Examples of the linear prokaryotic DNA used as a template include a DNA fragment, a PCR-amplified
DNA product, a duplicated oligonucleotide ligation, an in vitro transcriptional RNA, a prokaryotic RNA, and
the like.

[0264] In addition to the production of the polypeptide according to the present invention, synthesis of a radioactive 
labeled protein, confirmation of the expression capability of a cloned gene, analysis of the function of transcriptional 
reaction or translation reaction, and the like can be carried out using this system. 

[0265] The polypeptide produced by the transformant of the present invention can be isolated and purified using the
general method for isolating and purifying an enzyme. For example, when the polypeptide of the present invention is
expressed as a soluble product in the host cells, the cells are collected by centrifugation after cultivation, suspended
in an aqueous buffer, and disrupted using an ultrasonicator, a French press, a Manton Gaulin homogenizer, a Dynomill,
or the like to obtain a cell-free extract. From the supernatant obtained by centrifuging the cell-free extract, a purified
product can be obtained by the general method used for isolating and purifying an enzyme, for example, solvent extraction,
salting out using ammonium sulfate or the like, desalting, precipitation using an organic solvent, anion exchange
chromatography using a resin, such as diethylaminoethyl (DEAE)-Sepharose, DIAION HPA-75 (manufactured
by Mitsubishi Chemical) or the like, cation exchange chromatography using a resin, such as S-Sepharose FF (manufactured
by Pharmacia) or the like, hydrophobic chromatography using a resin, such as butyl sepharose, phenyl sepharose
or the like, gel filtration using a molecular sieve, affinity chromatography, chromatofocusing, or electrophoresis,
such as isoelectric focusing or the like, alone or in combination thereof.

[0266] When the polypeptide is expressed as an insoluble product in the host cells, the cells are collected in the
same manner, disrupted and centrifuged to recover the insoluble product of the polypeptide as the precipitate fraction.
Next, the insoluble product of the polypeptide is solubilized with a protein denaturing agent. The solubilized solution
is diluted or dialyzed to lower the concentration of the protein denaturing agent in the solution. Thus, the normal
configuration of the polypeptide is reconstituted. After the procedure, a purified product of the polypeptide can be obtained
by a purification/isolation method similar to the above.

[0267] When the polypeptide of the present invention or its derivative (for example, a polypeptide formed by adding
a sugar chain thereto) is secreted out of cells, the polypeptide or its derivative can be collected in the culture supernatant.
Namely, the culture supernatant is obtained by subjecting the culture medium to a treatment similar to the above (for
example, centrifugation). Then, a purified product can be obtained from the culture supernatant using a purification/isolation
method similar to the above.

[0268] The polypeptide obtained by the above method is within the scope of the polypeptide of the present invention,
and examples include a polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from
SEQ ID NOS:2 to 3431, and a polypeptide comprising an amino acid sequence represented by any one of SEQ ID
NOS:3502 to 6931.

[0269] Furthermore, a polypeptide comprising an amino acid sequence in which at least one amino acid is deleted,
replaced, inserted or added in the amino acid sequence of the polypeptide and having substantially the same activity
as that of the polypeptide is included in the scope of the present invention. The term "substantially the same activity
as that of the polypeptide" means the same activity represented by the inherent function, enzyme activity or the like
possessed by the polypeptide which has not been deleted, replaced, inserted or added. The polypeptide can be obtained
using a method for introducing part-specific mutation(s) described in, for example, Molecular Cloning, 2nd ed.,
Current Protocols in Molecular Biology, Nuc. Acids. Res., 10: 6487 (1982), Proc. Natl. Acad. Sci. USA, 79: 6409 (1982),
Gene, 34: 315 (1985), Nuc. Acids. Res., 13: 4431 (1985), Proc. Natl. Acad. Sci. USA, 82: 488 (1985) and the like. For
example, the polypeptide can be obtained by introducing mutation(s) to DNA encoding a polypeptide having the amino
acid sequence represented by any one of SEQ ID NOS:3502 to 6931. The number of the amino acids which are deleted,
replaced, inserted or added is not particularly limited; however, it is usually 1 to the order of tens, preferably 1 to 20,
more preferably 1 to 10, and most preferably 1 to 5, amino acids.

[0270] The at least one amino acid deletion, replacement, insertion or addition in the amino acid sequence of the
polypeptide of the present invention is used herein to mean that at least one amino acid is deleted, replaced, inserted
or added at one or plural positions in the amino acid sequence. The deletion, replacement, insertion or addition may
be caused in the same amino acid sequence simultaneously. Also, the amino acid residue replaced, inserted or added
can be natural or non-natural. Examples of the natural amino acid residue include L-alanine, L-asparagine, L-aspartic
acid, L-glutamine, L-glutamic acid, glycine, L-histidine, L-isoleucine, L-leucine, L-lysine, L-methionine, L-phenylalanine,
L-proline, L-serine, L-threonine, L-tryptophan, L-tyrosine, L-valine, L-cysteine, and the like.

[0271] Herein, examples of amino acid residues which are replaced with each other are shown below. The amino 
acid residues in the same group can be replaced with each other. 

Group A:

[0272] leucine, isoleucine, norleucine, valine, norvaline, alanine, 2-aminobutanoic acid, methionine, O-methylserine,
t-butylglycine, t-butylalanine, cyclohexylalanine;

Group B:

[0273] aspartic acid, glutamic acid, isoaspartic acid, isoglutamic acid, 2-aminoadipic acid, 2-aminosuberic acid;

Group C:

[0274] asparagine, glutamine;

Group D:

[0275] lysine, arginine, ornithine, 2,4-diaminobutanoic acid, 2,3-diaminopropionic acid;

Group E:

[0276] proline, 3-hydroxyproline, 4-hydroxyproline;

Group F:

[0277] serine, threonine, homoserine;

Group G:

[0278] phenylalanine, tyrosine.
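
The grouping above can be read as a lookup table: a replacement is treated as interchangeable when both residues fall in the same group. The following is a minimal sketch of that reading only; the names GROUPS, group_of and is_same_group are hypothetical helpers introduced for illustration and are not part of the specification.

```python
# Substitution groups A to G, with residue names taken from the list above.
GROUPS = {
    "A": {"leucine", "isoleucine", "norleucine", "valine", "norvaline", "alanine",
          "2-aminobutanoic acid", "methionine", "O-methylserine",
          "t-butylglycine", "t-butylalanine", "cyclohexylalanine"},
    "B": {"aspartic acid", "glutamic acid", "isoaspartic acid", "isoglutamic acid",
          "2-aminoadipic acid", "2-aminosuberic acid"},
    "C": {"asparagine", "glutamine"},
    "D": {"lysine", "arginine", "ornithine", "2,4-diaminobutanoic acid",
          "2,3-diaminopropionic acid"},
    "E": {"proline", "3-hydroxyproline", "4-hydroxyproline"},
    "F": {"serine", "threonine", "homoserine"},
    "G": {"phenylalanine", "tyrosine"},
}


def group_of(residue):
    """Return the group letter containing the residue, or None if it is not listed."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    return None


def is_same_group(residue_a, residue_b):
    """True only when both residues are listed and belong to the same group."""
    group = group_of(residue_a)
    return group is not None and group == group_of(residue_b)
```

For example, is_same_group("leucine", "valine") holds because both are in Group A, whereas a residue that appears in no group (such as histidine) is never reported as interchangeable by this sketch.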

[0279] Also, in order that the resulting mutant polypeptide has substantially the same activity as that of the polypeptide
which has not been mutated, it is preferred that the mutant polypeptide has a homology of 60% or more, preferably
80% or more, and particularly preferably 95% or more, with the polypeptide which has not been mutated, when calculated,
for example, using default (initial setting) parameters by a homology searching software, such as BLAST, FASTA,
or the like.
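
As an illustration of the 60%, 80% and 95% thresholds, the following sketch computes a simple percent identity over two sequences that have already been aligned (gaps written as "-"). This is a simplification under stated assumptions: BLAST and FASTA compute their scores on their own alignments with their own parameters, and the function names percent_identity and homology_tier are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns that carry identical residues.

    Both inputs must be pre-aligned strings of equal length; '-' marks a gap.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def homology_tier(pct):
    """Map a percentage onto the preference tiers stated in the text."""
    if pct >= 95.0:
        return "particularly preferred (95% or more)"
    if pct >= 80.0:
        return "preferred (80% or more)"
    if pct >= 60.0:
        return "acceptable (60% or more)"
    return "below 60%"
```

A gap opposite a residue counts as a mismatch here; other conventions exist, and the value reported by a homology search program may differ.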
[0280] Also, the polypeptide of the present invention can be produced by a chemical synthesis method, such as
Fmoc (fluorenylmethyloxycarbonyl) method, tBoc (t-butyloxycarbonyl) method, or the like. It can also be synthesized
using a peptide synthesizer manufactured by Advanced ChemTech, Perkin-Elmer, Pharmacia, Protein Technology
Instrument, Synthecell-Vega, PerSeptive, Shimadzu Corporation, or the like.

[0281] The transformant of the present invention can be used for objects other than the production of the polypeptide
of the present invention.

[0282] Specifically, at least one component selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an
organic acid, and analogues thereof can be produced by culturing the transformant containing the polynucleotide or
recombinant vector of the present invention in a medium to produce and accumulate at least one component selected
from amino acids, nucleic acids, vitamins, saccharides, organic acids, and analogues thereof, and recovering the same
from the medium.

[0283] The biosynthesis pathways, decomposition pathways and regulatory mechanisms of physiologically active
substances such as amino acids, nucleic acids, vitamins, saccharides, organic acids and analogues thereof differ from
organism to organism. The productivity of such a physiologically active substance can be improved using these differences,
specifically by introducing a heterologous gene relating to the biosynthesis thereof. For example, the content
of lysine, which is one of the essential amino acids, in a plant seed was improved by introducing a synthase gene
derived from a bacterium (WO 93/19190). Also, arginine is excessively produced in a culture by introducing an arginine
synthase gene derived from Escherichia coli (Japanese Examined Patent Publication 23750/93).

[0284] To produce such a physiologically active substance, the transformant according to the present invention can
be cultured by the same method as employed in culturing the transformant for producing the polypeptide of the present
invention as described above. Also, the physiologically active substance can be recovered from the culture medium
by a combination of, for example, the ion exchange resin method, the precipitation method and other known methods.

[0285] Examples of methods known to one of ordinary skill in the art include electroporation, calcium transfection,
the protoplast method, the method using a phage, and the like, when the host is a bacterium; and microinjection,
calcium phosphate transfection, the positively charged lipid-mediated method and the method using a virus, and the
like, when the host is a eukaryote (Molecular Cloning, 2nd ed.; Spector et al., Cells: a laboratory manual, Cold Spring
Harbor Laboratory Press, 1998). Examples of the host include prokaryotes, lower eukaryotes (for example, yeasts),
higher eukaryotes (for example, mammals), and cells isolated therefrom. As the state of a recombinant polynucleotide
fragment present in the host cells, it can be integrated into the chromosome of the host. Alternatively, it can be integrated
into a factor (for example, a plasmid) having an independent replication unit outside the chromosome. These transformants
are usable in producing the polypeptides of the present invention encoded by the ORF of the genome of
Corynebacterium glutamicum, the polynucleotides of the present invention and fragments thereof. Alternatively, they
can be used in producing arbitrary polypeptides under the regulation by an EMF of the present invention.

11. Preparation of antibody recognizing the polypeptide of the present invention

[0286] An antibody which recognizes the polypeptide of the present invention, such as a polyclonal antibody, a monoclonal
antibody, or the like, can be produced using, as an antigen, a purified product of the polypeptide of the present
invention or a partial fragment polypeptide of the polypeptide or a peptide having a partial amino acid sequence of the
polypeptide of the present invention.

(1) Production of polyclonal antibody 

[0287] A polyclonal antibody can be produced using, as an antigen, a purified product of the polypeptide of the
present invention, a partial fragment polypeptide of the polypeptide, or a peptide having a partial amino acid sequence
of the polypeptide of the present invention, and immunizing an animal with the same.

[0288] Examples of the animal to be immunized include rabbits, goats, rats, mice, hamsters, chickens and the like.

[0289] A dosage of the antigen is preferably 50 to 100 μg per animal.

[0290] When the peptide is used as the antigen, it is preferably a peptide covalently bonded to a carrier protein, such
as keyhole limpet haemocyanin, bovine thyroglobulin, or the like. The peptide used as the antigen can be synthesized
by a peptide synthesizer.

[0291] The administration of the antigen is, for example, carried out 3 to 10 times at intervals of 1 or 2 weeks
after the first administration. On the 3rd to 7th day after each administration, a blood sample is collected from the
venous plexus of the eyeground, and it is confirmed that the serum reacts with the antigen by the enzyme immunoassay
(Enzyme-linked Immunosorbent Assay (ELISA), Igaku Shoin (1976); Antibodies - A Laboratory Manual, Cold Spring
Harbor Laboratory (1988)) or the like.

[0292] Serum is obtained from the immunized non-human mammal with a sufficient antibody titer against the antigen
used for the immunization, and the serum is isolated and purified to obtain a polyclonal antibody.

[0293] Examples of the method for the isolation and purification include centrifugation, salting out by 40-50% saturated
ammonium sulfate, caprylic acid precipitation (Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory
(1988)), or chromatography using a DEAE-Sepharose column, an anion exchange column, a protein A- or G-column,
a gel filtration column, and the like, alone or in combination thereof, by methods known to those of ordinary skill in the art.

(2) Production of monoclonal antibody 

(a) Preparation of antibody-producing cell 

[0294] A rat having a serum showing a sufficient antibody titer against a partial fragment polypeptide of the polypeptide
of the present invention used for immunization is used as a supply source of an antibody-producing cell.

[0295] On the 3rd to 7th day after the final administration of the antigen substance to the rat showing the antibody titer, the
spleen is excised.

[0296] The spleen is cut to pieces in MEM medium (manufactured by Nissui Pharmaceutical), loosened using a pair
of forceps, followed by centrifugation at 1,200 rpm for 5 minutes, and the resulting supernatant is discarded.

[0297] The spleen in the precipitated fraction is treated with a Tris-ammonium chloride buffer (pH 7.65) for 1 to 2 
minutes to eliminate erythrocytes and washed three times with MEM medium, and the resulting spleen cells are used 
as antibody-producing cells. 

(b) Preparation of myeloma cells

[0298] As myeloma cells, an established cell line obtained from mouse or rat is used. Examples of useful cell lines
include those derived from a mouse, such as P3-X63Ag8-U1 (hereinafter referred to as "P3-U1") (Curr. Topics in Microbiol.
Immunol., 81: 1 (1978); Europ. J. Immunol., 6: 511 (1976)); SP2/0-Ag14 (SP-2) (Nature, 276: 269 (1978));
P3-X63-Ag8653 (653) (J. Immunol., 123: 1548 (1979)); P3-X63-Ag8 (X63) cell line (Nature, 256: 495 (1975)), and the
like, which are 8-azaguanine-resistant mouse (BALB/c) myeloma cell lines. These cell lines are subcultured in 8-azaguanine
medium (medium in which, to a medium obtained by adding 1.5 mmol/l glutamine, 5×10^-5 mol/l 2-mercaptoethanol,
10 μg/ml gentamicin and 10% fetal calf serum (FCS) (manufactured by CSL) to RPMI-1640 medium (hereinafter
referred to as the "normal medium"), 8-azaguanine is further added at 15 μg/ml) and cultured in the normal
medium 3 or 4 days before cell fusion, and 2×10^ or more of the cells are used for the fusion.

(c) Production of hybridoma 

[0299] The antibody-producing cells obtained in (a) and the myeloma cells obtained in (b) are washed with MEM
medium or PBS (disodium hydrogen phosphate: 1.83 g, sodium dihydrogen phosphate: 0.21 g, sodium chloride: 7.65
g, distilled water: 1 liter, pH: 7.2) and mixed to give a ratio of antibody-producing cells : myeloma cells = 5 : 1 to
10 : 1, followed by centrifugation at 1,200 rpm for 5 minutes, and the supernatant is discarded.

[0300] The cells in the resulting precipitated fraction are thoroughly loosened, 0.2 to 1 ml of a mixed solution of 2
g of polyethylene glycol-1000 (PEG-1000), 2 ml of MEM medium and 0.7 ml of dimethylsulfoxide (DMSO) per 10^
antibody-producing cells is added to the cells under stirring at 37°C, and then 1 to 2 ml of MEM medium is further
added thereto several times at 1 to 2 minute intervals.

[0301] After the addition, MEM medium is added to give a total amount of 50 ml. The resulting prepared solution is
centrifuged at 900 rpm for 5 minutes, and then the supernatant is discarded. The cells in the resulting precipitated
fraction are gently loosened and then gently suspended in 100 ml of HAT medium (the normal medium to which 10^-4
mol/l hypoxanthine, 1.5×10^-5 mol/l thymidine and 4×10^-7 mol/l aminopterin have been added) by repeated drawing
up into and discharging from a measuring pipette.

[0302] The suspension is poured into a 96 well culture plate at 100 μl/well and cultured at 37°C for 7 to 14 days in
a 5% CO2 incubator.

[0303] After culturing, a part of the culture supernatant is recovered, and a hybridoma which specifically reacts with
a partial fragment polypeptide of the polypeptide of the present invention is selected according to the enzyme immunoassay
described in Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, Chapter 14 (1998) and the like.

[0304] A specific example of the enzyme immunoassay is described below.

[0305] The partial fragment polypeptide of the polypeptide of the present invention used as the antigen in the immunization
is spread on a suitable plate, is allowed to react with a hybridoma culturing supernatant or a purified antibody
obtained in (d) described below as a first antibody, and is further allowed to react with an anti-rat or anti-mouse immunoglobulin
antibody labeled with an enzyme, a chemical luminous substance, a radioactive substance or the like as a
second antibody for reaction suitable for the labeled substance. A hybridoma which specifically reacts with the polypeptide
of the present invention is selected as a hybridoma capable of producing a monoclonal antibody of the present
invention.

[0306] Cloning of the hybridoma is repeated twice by limiting dilution analysis (HT medium (a medium in which
aminopterin has been removed from HAT medium) is firstly used, and the normal medium is secondly used), and a
hybridoma which is stable and shows a sufficient antibody titer is selected as a hybridoma capable of
producing a monoclonal antibody of the present invention.

(d) Preparation of monoclonal antibody 

[0307] The monoclonal antibody-producing hybridoma cells obtained in (c) are injected intraperitoneally into 8- to
10-week-old mice or nude mice treated with pristane (intraperitoneal administration of 0.5 ml of 2,6,10,14-tetramethylpentadecane
(pristane), followed by 2 weeks of feeding) at 5×10^ to 20×10^ cells/animal. The hybridoma causes
ascites tumor in 10 to 21 days.

[0308] The ascitic fluid is collected from the mice or nude mice, and centrifuged at 3000 rpm for 5 minutes to remove
solid contents.

[0309] A monoclonal antibody can be purified and isolated from the resulting supernatant according to the method
similar to that used in the polyclonal antibody.

[0310] The subclass of the antibody can be determined using a mouse monoclonal antibody typing kit or a rat monoclonal
antibody typing kit. The polypeptide amount can be determined by the Lowry method or by calculation based
on the absorbance at 280 nm.
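
The calculation based on the absorbance at 280 nm follows the Beer-Lambert relation, concentration = A280 / (extinction coefficient × path length). A minimal sketch is given below. The extinction coefficient is not stated in the text; the value 1.4 ml/(mg·cm) used in the test is a commonly quoted approximation for immunoglobulin G and is an assumption here, as are the function and parameter names.

```python
def concentration_mg_per_ml(a280, ext_coeff_ml_per_mg_cm, path_cm=1.0, dilution=1.0):
    """Estimate protein concentration (mg/ml) from the absorbance at 280 nm.

    a280                    measured absorbance of the (possibly diluted) sample
    ext_coeff_ml_per_mg_cm  mass extinction coefficient, ml/(mg*cm); must be supplied
    path_cm                 optical path length of the cuvette in cm
    dilution                dilution factor applied before measurement
    """
    if ext_coeff_ml_per_mg_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a280 * dilution / (ext_coeff_ml_per_mg_cm * path_cm)
```

With these assumptions, an undiluted sample reading A280 = 0.70 in a 1 cm cuvette corresponds to roughly 0.5 mg/ml.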

[0311] The antibody obtained in the above is within the scope of the antibody of the present invention.

[0312] The antibody can be used for the general assay using an antibody, such as a radioactive material labeled
immunoassay (RIA), competitive binding assay, an immunotissue chemical staining method (ABC method, CSA method,
etc.), immunoprecipitation, Western blotting, ELISA assay, and the like (An Introduction to Radioimmunoassay and
Related Techniques, Elsevier Science (1986); Techniques in Immunocytochemistry, Academic Press, Vol. 1 (1982),
Vol. 2 (1983) & Vol. 3 (1985); Practice and Theory of Enzyme Immunoassays, Elsevier Science (1985); Enzyme-linked
Immunosorbent Assay (ELISA), Igaku Shoin (1976); Antibodies - A Laboratory Manual, Cold Spring Harbor Laboratory
(1988); Monoclonal Antibody Experiment Manual, Kodansha Scientific (1987); Second Series Biochemical Experiment
Course, Vol. 5, Immunobiochemistry Research Method, Tokyo Kagaku Dojin (1986)).
[0313] The antibody of the present invention can be used as it is or after being labeled with a label. 

[0314] Examples of the label include radioisotope, an affinity label (e.g., biotin, avidin, or the like), an enzyme label
(e.g., horseradish peroxidase, alkaline phosphatase, or the like), a fluorescence label (e.g., FITC, rhodamine, or the
like), a label using a rhodamine atom (J. Histochem. Cytochem., 18: 315 (1970); Meth. Enzym., 62: 308 (1979);
Immunol., 109: 129 (1972); J. Immunol. Meth., 13: 215 (1979)), and the like.

[0315] Expression of the polypeptide of the present invention, fluctuation of the expression, the presence or absence
of structural change of the polypeptide, and the presence or absence in an organism other than coryneform bacteria
of a polypeptide corresponding to the polypeptide can be analyzed using the antibody or the labeled antibody by the
above assay, or a polypeptide array or proteome analysis described below.

[0316] Furthermore, the polypeptide recognized by the antibody can be purified by immunoaffinity chromatography 
using the antibody of the present invention. 

12. Production and use of polypeptide array 

(1) Production of polypeptide array 

[0317] A polypeptide array can be produced using the polypeptide of the present invention obtained in the above
item 10 or the antibody of the present invention obtained in the above item 11.

[0318] The polypeptide array of the present invention includes protein chips, and comprises a solid support and the 
polypeptide or antibody of the present invention adhered to the surface of the solid support. 

[0319] Examples of the solid support include plastic such as polycarbonate or the like; an acrylic resin, such as
polyacrylamide or the like; complex carbohydrates, such as agarose, sepharose, or the like; silica; a silica-based material,
carbon, a metal, inorganic glass, latex beads, and the like.

[0320] The polypeptides or antibodies according to the present invention can be adhered to the surface of the solid
support according to the method described in Biotechniques, 27: 1258-61 (1999); Molecular Medicine Today, 5: 326-7
(1999); Handbook of Experimental Immunology, 4th edition, Blackwell Scientific Publications, Chapter 10 (1986); Meth.
Enzym., 34 (1974); Advances in Experimental Medicine and Biology, 42 (1974); U.S. Patent 4,681,870; U.S. Patent
4,282,287; U.S. Patent 4,762,881, or the like.

[0321] The analysis described herein can be efficiently performed by adhering the polypeptide or antibody of the
present invention to the solid support at a high density, though a high fixation density is not always necessary.

(2) Use of polypeptide array

[0322] A polypeptide or a compound capable of binding to and interacting with the polypeptides of the present invention
adhered to the array can be identified using the polypeptide array to which the polypeptides of the present
invention have been adhered as described in the above (1).

[0323] Specifically, a polypeptide or a compound capable of binding to and interacting with the polypeptides of the 
present invention can be identified by subjecting the polypeptides of the present invention to the following steps (i) to (iv): 

(i) preparing a polypeptide array having the polypeptide of the present invention adhered thereto by the method
of the above (1);

(ii) incubating the polypeptide immobilized on the polypeptide array together with at least one of a second polypeptide
or compound;

(iii) detecting any complex formed between the at least one of a second polypeptide or compound and the polypeptide
immobilized on the array using, for example, a label bound to the at least one of a second polypeptide or
compound, or a secondary label which specifically binds to the complex or to a component of the complex after
unbound material has been removed; and

(iv) analyzing the detection data.
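
Step (iv) is not specified further in the text. One common way to analyze such detection data, shown here purely as an assumed illustration, is to call a spot positive when its label signal exceeds the mean of blank (no-polypeptide) spots by a chosen number of standard deviations. The function name call_binders, the spot identifiers and the default cutoff of three standard deviations are hypothetical.

```python
from statistics import mean, stdev


def call_binders(signals, blanks, n_sd=3.0):
    """Return the spots whose signal exceeds mean(blanks) + n_sd * stdev(blanks).

    signals  mapping of spot identifier -> measured label intensity
    blanks   intensities measured on blank spots (at least two values)
    n_sd     number of standard deviations above background required (assumed cutoff)
    """
    cutoff = mean(blanks) + n_sd * stdev(blanks)
    return sorted(spot for spot, value in signals.items() if value > cutoff)
```

The threshold rule is a design choice and not a requirement of the method; replicate spots or a reference label could equally be used to normalize the signals before calling.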

[0324] Specific examples of the polypeptide array to which the polypeptide of the present invention has been adhered
include a polypeptide array containing a solid support to which at least one of the following is adhered: a polypeptide
containing an amino acid sequence selected from SEQ ID NOS:3502 to 7001, a polypeptide containing an amino acid
sequence in which at least one amino acid is deleted, replaced, inserted or added in the amino acid sequence of the
polypeptide and having substantially the same activity as that of the polypeptide, a polypeptide containing an amino acid
sequence having a homology of 60% or more with the amino acid sequences of the polypeptide and having substantially
the same activity as that of the polypeptides, a partial fragment polypeptide, and a peptide comprising an amino acid
sequence of a part of a polypeptide.

[0325] The amount of production of a polypeptide derived from coryneform bacteria can be analyzed using a polypep- 
tide array to which the antibody of the present invention has been adhered in the above (1). 

[0326] Specifically, the expression amount of a gene derived from a mutant of coryneform bacteria can be analyzed
by subjecting the gene to the following steps (i) to (iv):

(i) preparing a polypeptide array by the method of the above (1);

(ii) incubating the polypeptide array (the first antibody) together with a polypeptide derived from a mutant of coryneform
bacteria;

(iii) detecting the polypeptide bound to the polypeptide immobilized on the array using a labeled second antibody
of the present invention; and

(iv) analyzing the detection data.

[0327] Specific examples of the polypeptide array to which the antibody of the present invention is adhered include
a polypeptide array comprising a solid support to which is adhered at least one antibody which recognizes a polypeptide
comprising an amino acid sequence selected from SEQ ID NOS:3502 to 7001, a polypeptide comprising an amino
acid sequence in which at least one amino acid is deleted, replaced, inserted or added in the amino acid sequence
of the polypeptide and having substantially the same activity as that of the polypeptide, a polypeptide comprising an
amino acid sequence having a homology of 60% or more with the amino acid sequences of the polypeptide and having
substantially the same activity as that of the polypeptides, a partial fragment polypeptide, or a peptide comprising an
amino acid sequence of a part of a polypeptide.

[0328] A fluctuation in an expression amount of a specific polypeptide can be monitored using a polypeptide obtained
in the time course of culture as the polypeptide derived from coryneform bacteria. The culturing conditions can be
optimized by analyzing the fluctuation.

[0329] When a polypeptide derived from a mutant of coryneform bacteria is used, a mutated polypeptide can be
detected.

13. Identification of useful mutation in mutant by proteome analysis 

[0330] Usually, proteome analysis is used herein to refer to a method wherein a polypeptide is separated by two-dimensional
electrophoresis and the separated polypeptide is digested with an enzyme, followed by identification of the
polypeptide using a mass spectrometer (MS) and searching a data base.

[0331] The two-dimensional electrophoresis means an electrophoretic method which is performed by combining two
electrophoretic procedures having different principles. For example, polypeptides are separated depending on molecular
weight in the primary electrophoresis. Next, the gel is rotated by 90° or 180° and the secondary electrophoresis
is carried out depending on isoelectric point. Thus, various separation patterns can be achieved (JIS K 3600 2474).

[0332] In searching the data base, the amino acid sequence information of the polypeptides of the present invention
and the recording medium of the present invention provided in the above items 2 and 8 can be used.

[0333] The proteome analysis of a coryneform bacterium and its mutant makes it possible to identify a polypeptide
showing a fluctuation therebetween.

[0334] The proteome analysis of a wild type strain of coryneform bacteria and a production strain showing an improved
productivity of a target product makes it possible to efficiently identify a mutation protein which is useful in
breeding for improving the productivity of a target product or a protein of which the expression amount is fluctuated.

[0335] Specifically, a wild type strain of coryneform bacteria and a lysine-producing strain thereof are each subjected
to the proteome analysis. Then, a spot increased in the lysine-producing strain, compared with the wild type strain, is
found and a data base is searched so that a polypeptide showing an increase in yield in accordance with an increase
in the lysine productivity can be identified. For example, as a result of the proteome analysis on a wild type strain and
a lysine-producing strain, the productivity of the catalase having the amino acid sequence represented by SEQ ID NO:
3785 is increased in the lysine-producing mutant.
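
The comparison of spots between the two strains can be sketched as a fold-change filter over matched spot intensities. The sketch below is illustrative only: the function name increased_spots, the spot identifiers and the two-fold default threshold are assumptions, and the text does not prescribe any particular cutoff.

```python
def increased_spots(wild_type, producer, min_fold=2.0):
    """Return {spot: fold change} for spots increased in the producing strain.

    wild_type, producer  mappings of spot identifier -> normalized spot intensity
    min_fold             minimum producer / wild-type ratio to report (assumed cutoff)
    Spots missing from the producer map or with a non-positive wild-type
    intensity are skipped, because no ratio can be formed for them.
    """
    hits = {}
    for spot, wt_value in wild_type.items():
        mut_value = producer.get(spot)
        if mut_value is None or wt_value <= 0:
            continue
        fold = mut_value / wt_value
        if fold >= min_fold:
            hits[spot] = fold
    return hits
```

Each reported spot would then be excised, digested and identified by mass spectrometry and a data base search, as described for the proteome analysis.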

[0336] Once a protein having a high expression level has been identified by proteome analysis using the nucleotide
sequence information and the amino acid sequence information of the genome of the coryneform bacteria of the
present invention, and a recording medium storing the sequences, the nucleotide sequence of the gene encoding this
protein and the nucleotide sequence upstream thereof can be searched at the same time, and thus a nucleotide
sequence having a high-expression promoter can be efficiently selected.
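
As an illustration of retrieving the upstream nucleotide sequence of an identified gene, a minimal sketch is given below. It assumes 1-based inclusive coordinates, assumes that an Initial position larger than the Terminal position denotes an ORF on the complementary strand, and uses a hypothetical window of 300 nucleotides; none of these conventions or names is stated in the present specification.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def upstream_region(genome: str, initial: int, terminal: int, length: int = 300) -> str:
    """Return up to `length` bases upstream of an ORF, read 5'->3' on the ORF strand.

    Coordinates are 1-based and inclusive (assumption). initial > terminal is
    taken to mean that the ORF lies on the complementary strand (assumption).
    """
    if initial <= terminal:
        # Forward strand: upstream bases precede the first base of the ORF.
        start = max(0, initial - 1 - length)
        return genome[start:initial - 1]
    # Complementary strand: upstream bases lie at higher genome coordinates.
    end = min(len(genome), initial + length)
    return reverse_complement(genome[initial:end])


toy_genome = "ACGTTTGACCAT"  # hypothetical 12-base sequence
print(upstream_region(toy_genome, 7, 9, length=3))
print(upstream_region(toy_genome, 6, 4, length=3))
```

The returned window is the region in which a promoter search would be carried out.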

[0337] In the proteome analysis, a spot on the two-dimensional electrophoresis gel showing a fluctuation is sometimes
derived from a modified protein. However, the modified protein can be efficiently identified using the nucleotide
sequence information and the amino acid sequence information of the genome of coryneform bacteria, and the
recording medium storing the sequences, according to the present invention.

[0338] Moreover, a useful mutation point in a useful mutant can be easily specified by searching a nucleotide
sequence (nucleotide sequence of promoters, ORFs, or the like) relating to the thus identified protein using the
nucleotide sequence information and the amino acid sequence information of the genome of coryneform bacteria of the
present invention and a recording medium storing the sequences, and by using a primer designed on the basis of the
detected nucleotide sequence. Once the useful mutation point has been specified, an industrially useful mutant
having the useful mutation or another useful mutation derived therefrom can be easily bred.
[0339] The present invention will be explained in detail below based on Examples. However, the present invention
is not limited thereto.

Example 1

Determination of the full nucleotide sequence of genome of Corynebacterium glutamicum 

[0340] The full nucleotide sequence of the genome of Corynebacterium glutamicum was determined based on the
whole genome shotgun method (Science, 269: 496-512 (1995)). In this method, a genome library was prepared and
the terminal sequences were determined at random. Subsequently, these sequences were ligated on a computer to
cover the full genome. Specifically, the following procedure was carried out.

(1) Preparation of genome DNA of Corynebacterium glutamicum ATCC 13032 

[0341] Corynebacterium glutamicum ATCC 13032 was cultured in BY medium (7 g/l meat extract, 10 g/l peptone, 3
g/l sodium chloride, 5 g/l yeast extract, pH 7.2) containing 1% of glycine at 30°C overnight and the cells were collected
by centrifugation. After washing with STE buffer (10.3% sucrose, 25 mmol/l Tris hydrochloride, 25 mmol/l EDTA, pH
8.0), the cells were suspended in 10 ml of STE buffer containing 10 mg/ml lysozyme, followed by gentle shaking at
37°C for 1 hour. Then, 2 ml of 10% SDS was added thereto to lyse the cells, and the resultant mixture was maintained
at 65°C for 10 minutes and then cooled to room temperature. Then, 10 ml of Tris-neutralized phenol was added thereto,
followed by gentle shaking at room temperature for 30 minutes and centrifugation (15,000 x g, 20 minutes, 20°C). The
aqueous layer was separated and subjected to extraction with phenol/chloroform and extraction with chloroform (twice)
in the same manner. To the aqueous layer, 3 mol/l sodium acetate solution (pH 5.2) and isopropanol were added at
1/10 volume and twice the volume, respectively, followed by gentle stirring to precipitate the genome DNA. The
genome DNA was dissolved again in 3 ml of TE buffer (10 mmol/l Tris hydrochloride, 1 mmol/l EDTA, pH 8.0) containing
0.02 mg/ml of RNase and maintained at 37°C for 45 minutes. The extractions with phenol, phenol/chloroform and
chloroform were carried out successively in the same manner as above. The genome DNA was subjected to
isopropanol precipitation. The thus formed genome DNA precipitate was washed with 70% ethanol three times, followed
by air-drying, and dissolved in 1.25 ml of TE buffer to give a genome DNA solution (concentration: 0.1 mg/ml).

(2) Construction of a shotgun library 

[0342] TE buffer was added to 0.01 mg of the thus prepared genome DNA of Corynebacterium glutamicum ATCC
13032 to give a total volume of 0.4 ml, and the mixture was treated with a sonicator (Yamato Powersonic Model 150)
at an output of 20 continuously for 5 seconds to obtain fragments of 1 to 10 kb. The genome fragments were
blunt-ended using a DNA blunting kit (manufactured by Takara Shuzo) and then fractionated by 6% polyacrylamide gel
electrophoresis. Genome fragments of 1 to 2 kb were cut out from the gel, and 0.3 ml of MG elution buffer (0.5 mol/l
ammonium acetate, 10 mmol/l magnesium acetate, 1 mmol/l EDTA, 0.1% SDS) was added thereto, followed by shaking
at 37°C overnight to elute DNA. The DNA eluate was treated with phenol/chloroform, and then precipitated with ethanol
to obtain a genome library insert. The total insert and 500 ng of pUC18 SmaI/BAP (manufactured by Amersham
Pharmacia Biotech) were ligated at 16°C for 40 hours.

[0343] The ligation product was precipitated with ethanol and dissolved in 0.01 ml of TE buffer. The ligation solution
(0.001 ml) was introduced into 0.04 ml of E. coli ELECTRO MAX DH10B (manufactured by Life Technologies) by
electroporation under conditions according to the manufacturer's instructions. The mixture was spread on LB plate
medium (LB medium (10 g/l bactotrypton, 5 g/l yeast extract, 10 g/l sodium chloride, pH 7.0) containing 1.6% of agar)
containing 0.1 mg/ml ampicillin, 0.1 mg/ml X-gal and 1 mmol/l isopropyl-β-D-thiogalactopyranoside (IPTG) and cultured
at 37°C overnight.

[0344] The transformants obtained from colonies formed on the plate medium were stationarily cultured in a 96-well
titer plate having 0.05 ml of LB medium containing 0.1 mg/ml ampicillin at 37°C overnight. Then, 0.05 ml of LB medium
containing 20% glycerol was added thereto, followed by stirring to obtain a glycerol stock.

(3) Construction of cosmid library

[0345] About 0.1 mg of the genome DNA of Corynebacterium glutamicum ATCC 13032 was partially digested with
Sau3AI (manufactured by Takara Shuzo) and then ultracentrifuged (26,000 rpm, 18 hours, 20°C) under a 10 to 40%
sucrose density gradient obtained using 10% and 40% sucrose buffers (1 mol/l NaCl, 20 mmol/l Tris hydrochloride, 5
mmol/l EDTA, 10% or 40% sucrose, pH 8.0). After the centrifugation, the solution thus separated was fractionated into
tubes at 1 ml in each tube. After confirming the DNA fragment length of each fraction by agarose gel electrophoresis,
a fraction containing a large amount of DNA fragments of about 40 kb was precipitated with ethanol.
[0346] The DNA fragment was ligated to the BamHI site of superCos1 (manufactured by Stratagene) in accordance
with the manufacturer's instructions. The ligation product was incorporated into Escherichia coli XL-1-BlueMR strain
(manufactured by Stratagene) using Gigapack III Gold Packaging Extract (manufactured by Stratagene) in accordance
with the manufacturer's instructions. The Escherichia coli was spread on LB plate medium containing 0.1 mg/ml
ampicillin and cultured thereon at 37°C overnight to isolate colonies. The resulting colonies were stationarily cultured at
37°C overnight in a 96-well titer plate containing 0.05 ml of the LB medium containing 0.1 mg/ml ampicillin in each
well. LB medium containing 20% glycerol (0.05 ml) was added thereto, followed by stirring to obtain a glycerol stock.

(4) Determination of nucleotide sequence 
(4-1) Preparation of template

[0347] The full nucleotide sequence of Corynebacterium glutamicum ATCC 13032 was determined mainly based on
the whole genome shotgun method. The template used in the whole genome shotgun method was prepared by the
PCR method using the library prepared in the above (2).

[0348] Specifically, the clone derived from the whole genome shotgun library was inoculated using a replicator
(manufactured by GENETIX) into each well of a 96-well plate containing the LB medium containing 0.1 mg/ml of ampicillin
at 0.08 ml per well and then stationarily cultured at 37°C overnight.

[0349] Next, the culture was transferred using a copy plate (manufactured by Tokken) into a 96-well
reaction plate (manufactured by PE Biosystems) containing a PCR reaction solution (TaKaRa Ex Taq (manufactured by
Takara Shuzo)) at 0.08 ml per well. Then, PCR was carried out in accordance with the protocol by Makino et al.
(DNA Research, 5: 1-9 (1998)) using GeneAmp PCR System 9700 (manufactured by PE Biosystems) to amplify the
inserted fragment.

[0350] The excess primers and nucleotides were eliminated using a kit for purifying a PCR product (manufactured
by Amersham Pharmacia Biotech) and the residue was used as the template in the sequencing reaction.
[0351] Some nucleotide sequences were determined using a double-stranded DNA plasmid as a template.
[0352] The double-stranded DNA plasmid as the template was obtained by the following method.
[0353] The clone derived from the whole genome shotgun library was inoculated into a 24- or 96-well plate containing
a 2x YT medium (16 g/l bactotrypton, 10 g/l yeast extract, 5 g/l sodium chloride, pH 7.0) containing 0.05 mg/ml ampicillin
at 1.5 ml per well and then cultured under shaking at 37°C overnight.
[0354] The double-stranded DNA plasmid was prepared from the culture using an automatic plasmid
preparing machine, KURABO PI-50 (manufactured by Kurabo Industries) or a multiscreen (manufactured by Millipore) in
accordance with the protocol provided by the manufacturer.

[0355] To purify the double-stranded DNA plasmid using the multiscreen, Biomek 2000 (manufactured by Beckman
Coulter) or the like was employed.
[0356] The thus obtained double-stranded DNA plasmid was dissolved in water to give a concentration of about 0.1
mg/ml and used as the template in sequencing.

(4-2) Sequencing reaction 

[0357] To 6 µl of a solution of ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit (manufactured
by PE Biosystems), an M13 regular direction primer (M13-21) or an M13 reverse direction primer (M13REV) (DNA
Research, 5: 1-9 (1998)) and the template prepared in the above (4-1) (the PCR product or the plasmid) were added
to give 10 µl of a sequencing reaction solution. The primers and the templates were used in an amount of 1.6 pmol
and an amount of 50 to 200 ng, respectively.

[0358] A dye terminator sequencing reaction of 45 cycles was carried out with GeneAmp PCR System 9700
(manufactured by PE Biosystems) using the reaction solution. The cycle parameters were determined in accordance with the
manufacturer's instructions accompanying ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit. The
sample was purified using a Multiscreen HV plate (manufactured by Millipore) according to the manufacturer's
instructions. The thus purified reaction product was precipitated with ethanol, followed by drying, and then stored in the dark
at -30°C.

[0359] The dry reaction product was analyzed by ABI PRISM 377 DNA Sequencer and ABI PRISM 3700 DNA
Analyzer (both manufactured by PE Biosystems), each in accordance with the manufacturer's instructions.
[0360] The data of about 50,000 sequences in total (i.e., about 42,000 sequences obtained using 377 DNA
Sequencer and about 8,000 reactions obtained by 3700 DNA Analyzer) were transferred to a server (Alpha Server 4100:
manufactured by COMPAQ) and stored. The data of these about 50,000 sequences corresponded to 6 times as much as
the genome size.
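
The statement that about 50,000 sequences correspond to 6 times the genome size follows the usual relation coverage = (number of reads x mean read length) / genome size. The sketch below works this relation through. The genome size of 3,300,000 bp is an assumption made only for illustration (it is not stated in this passage), so the resulting mean read length is likewise only indicative.

```python
def fold_coverage(n_reads: int, mean_read_len: float, genome_size: int) -> float:
    """Average number of times each genome position is covered by reads."""
    return n_reads * mean_read_len / genome_size


def implied_read_length(n_reads: int, coverage: float, genome_size: int) -> float:
    """Mean read length needed to reach `coverage` with `n_reads` reads."""
    return coverage * genome_size / n_reads


N_READS = 50_000              # "about 50,000 sequences" in paragraph [0360]
COVERAGE = 6                  # "6 times as much as the genome size"
GENOME_SIZE_BP = 3_300_000    # assumption for illustration only

mean_len = implied_read_length(N_READS, COVERAGE, GENOME_SIZE_BP)
print(mean_len)
print(fold_coverage(N_READS, mean_len, GENOME_SIZE_BP))
```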

(5) Assembly 

[0361] All operations were carried out on a UNIX platform. The analytical data were displayed on a Macintosh
platform using the X Window System. The base call was carried out using phred (The University of Washington). The
vector sequence data was deleted using SPS Cross_Match (manufactured by Southwest Parallel Software). The
assembly was carried out using SPS phrap (manufactured by Southwest Parallel Software; a high-speed version of phrap
(The University of Washington)). The contig obtained by the assembly was analyzed using a graphical editor, consed
(The University of Washington). The series of operations from the base call to the assembly was carried out in one
batch using the script phredPhrap attached to consed.
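
The idea of joining reads through their overlapping ends can be illustrated with a toy sketch. This is not the algorithm used by phrap, which works with base quality values and tolerates mismatches; the sketch only merges two error-free reads by their longest exact suffix/prefix overlap, and the function names and the minimum overlap of 3 bases are hypothetical.

```python
from typing import Optional


def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Merge two reads into one contig, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    # Keep `left` whole and append only the non-overlapping tail of `right`.
    return left + right[size:]


print(merge_reads("ATGCGTAC", "GTACCTGA"))
print(merge_reads("AAAA", "CCCC"))
```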

(6) Determination of nucleotide sequence in gap part 

[0362] Each cosmid in the cosmid library constructed in the above (3) was prepared by a method similar to the
preparation of the double-stranded DNA plasmid described in the above (4-1). The nucleotide sequence at the end of
the inserted fragment of the cosmid was determined by using ABI PRISM BigDye Terminator Cycle Sequencing Ready
Reaction Kit (manufactured by PE Biosystems) according to the manufacturer's instructions.

[0363] About 800 cosmid clones were sequenced at both ends, and the contigs derived from the shotgun sequencing
in the above (5) were searched for nucleotide sequences coincident with these end sequences. Thus, the linkage between
the respective cosmid clones and the respective contigs was determined and mutual alignment was carried out. Furthermore,
the results were compared with the physical map of Corynebacterium glutamicum ATCC 13032 (Mol. Gen. Genet.,
252: 255-265 (1996)) to carry out mapping between the cosmids and the contigs.
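
The use of cosmid end sequences to link contigs can be sketched as follows. The sketch assumes that each cosmid end has already been assigned to the contig containing a coincident sequence (or to None when no contig matched); the cosmid and contig names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def link_contigs(
    end_hits: Dict[str, Tuple[Optional[str], Optional[str]]]
) -> Dict[Tuple[str, str], List[str]]:
    """Group cosmids by the pair of different contigs that their two ends hit."""
    links = defaultdict(list)
    for cosmid, (contig_a, contig_b) in end_hits.items():
        # A cosmid is informative only if both ends matched, in different contigs.
        if contig_a is None or contig_b is None or contig_a == contig_b:
            continue
        pair = tuple(sorted((contig_a, contig_b)))
        links[pair].append(cosmid)
    return dict(links)


# Hypothetical end-sequence assignments.
hits = {
    "cos001": ("contig1", "contig2"),
    "cos002": ("contig2", "contig1"),
    "cos003": ("contig3", "contig3"),
    "cos004": ("contig2", None),
}
print(link_contigs(hits))
```

Each contig pair supported by one or more cosmids is a candidate adjacency that can then be checked against the physical map.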

[0364] The sequence in the region which was not covered with the contigs was determined by the following method.
[0365] Clones containing sequences positioned at the ends of contigs were selected. Among these clones, about
1,000 clones wherein only one end of the inserted fragment had been determined were selected and the sequence at
the opposite end of the inserted fragment was determined. A shotgun library clone or a cosmid clone whose inserted
fragment had its two end sequences in two different contigs was identified, the full nucleotide sequence
of the inserted fragment of this clone was determined, and thus the nucleotide sequence of the gap part was determined.
When no shotgun library clone or cosmid clone covering the gap part was available, primers complementary to the
end sequences of the two contigs were prepared and the DNA fragment in the gap part was amplified by PCR. Then,
sequencing was performed by the primer walking method using the amplified DNA fragment as a template or by the
shotgun method in which the sequence of a shotgun clone prepared from the amplified DNA fragment was determined.
Thus, the nucleotide sequence of the region was determined.
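
The design of primers for amplifying a gap between two contigs can be sketched as below. The sketch assumes that both contigs are given on the same strand in genome order, takes the forward primer from the 3' end of the left contig and the reverse primer as the reverse complement of the 5' end of the right contig, and ignores melting temperature and specificity checks. The primer length and the sequences are hypothetical.

```python
from typing import Tuple


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gap_primers(left_contig: str, right_contig: str, primer_len: int = 20) -> Tuple[str, str]:
    """Return (forward, reverse) primers that face each other across the gap."""
    forward = left_contig[-primer_len:]                      # reads toward the gap
    reverse = reverse_complement(right_contig[:primer_len])  # reads back toward the gap
    return forward, reverse


# Hypothetical contig ends, with a short primer length for readability.
print(gap_primers("AAAACCCCGGGG", "ATGCACGT", primer_len=4))
```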

[0366] In a region showing a low sequence precision, primers were synthesized using the AUTOFINISH function and
the NAVIGATING function of consed (The University of Washington) and the sequence was determined by the primer
walking method to improve the sequence precision. The thus determined full nucleotide sequence of the genome of
Corynebacterium glutamicum ATCC 13032 strain is shown in SEQ ID NO:1.

(7) Identification of ORF and presumption of its function 

[0367] ORFs in the nucleotide sequence represented by SEQ ID NO:1 were identified according to the following
method. First, the ORF regions were determined using software for identifying ORFs, i.e., Glimmer, GeneMark and
GeneMark.hmm, on a UNIX platform according to the respective manuals attached to the software.
[0368] Based on the data thus obtained, ORFs in the nucleotide sequence represented by SEQ ID NO:1 were
identified.
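
What an ORF search reports can be illustrated with a naive six-frame scan. This is not the method used by Glimmer or GeneMark, which rely on statistical models of coding sequence; the sketch only looks for a start codon followed in frame by a stop codon. The set of start codons and the coordinate convention (1-based, Initial larger than Terminal for the complementary strand) are assumptions.

```python
from typing import List, Tuple

STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG", "GTG", "TTG"}  # assumption: common bacterial start codons


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def orfs_in_strand(seq: str, min_codons: int = 2) -> List[Tuple[int, int]]:
    """0-based (start, end_exclusive) of start-to-stop ORFs in the three frames."""
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in STARTS:
                start = i
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    found.append((start, i + 3))
                start = None
    return found


def find_orfs(genome: str, min_codons: int = 2) -> List[Tuple[int, int]]:
    """Return (initial, terminal) pairs, 1-based, on both strands."""
    n = len(genome)
    orfs = [(s + 1, e) for s, e in orfs_in_strand(genome, min_codons)]
    for s, e in orfs_in_strand(reverse_complement(genome), min_codons):
        # Map positions on the complementary strand back to genome coordinates.
        orfs.append((n - s, n - e + 1))
    return sorted(orfs, key=min)


print(find_orfs("CCATGAAATAGCC"))
print(find_orfs("GGCTATTTCATGG"))
```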

[0369] The putative function of an ORF was determined by searching the homology of the identified amino acid
sequence of the ORF against an amino acid database consisting of protein-encoding domains derived from Swiss-Prot,
PIR, or the Genpept database (constituted by protein-encoding domains derived from the GenBank database), using
either the homology-searching software Frame Search (manufactured by Compugen) or BLAST. The nucleotide sequences
of the thus determined ORFs are shown in SEQ ID NOS:2 to 3501, and the amino acid sequences encoded by these
ORFs are shown in SEQ ID NOS:3502 to 7001.

[0370] In some entries of the sequence listing of the present invention, a nucleotide sequence other than ATG,
such as TTG, TGT, GGT, or the like, is read as an initiating codon encoding Met.
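
The rule that a non-ATG initiating codon is still read as Met can be sketched as follows. The codon table is the standard genetic code written in the conventional TCAG order, which is an assumption about the table used for the sequence listing; the example ORF is hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_orf(orf: str) -> str:
    """Translate an upper-case ORF, reading the first codon as Met whatever it is."""
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    protein = ["M"]  # the initiating codon encodes Met even when it is not ATG
    for codon in codons[1:]:
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate_orf("TTGAAACGTTAA"))
print(translate_orf("ATGAAACGTTAA"))
```

Both calls give the same protein because only the first codon differs and it is read as Met in either case.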

[0371] Also, the preferred nucleotide sequences are SEQ ID NOS:2 to 355 and 357 to 3501, and the preferred amino
acid sequences are shown in SEQ ID NOS:3502 to 3855 and 3857 to 7001.

[0372] Table 1 shows the registration numbers in the above-described databases of sequences which were judged
as having the highest homology with the nucleotide sequences of the ORFs as the results of the homology search in
the amino acid sequences using the homology-searching software Frame Search (manufactured by Compugen),
names of the genes of these sequences, the functions of the genes, and the matched length, identities and analogies
compared with publicly known amino acid translation sequences. Moreover, the corresponding positions were
confirmed via the alignment of the nucleotide sequence of an arbitrary ORF with the nucleotide sequence of SEQ ID NO:1.
Also, the positions of nucleotide sequences other than the ORFs (for example, ribosomal RNA genes, transfer RNA
genes, IS sequences, and the like) on the genome were determined.
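
The relation between the Initial (nt), Terminal (nt) and ORF (bp) columns of Table 1 can be checked with a short calculation. The sketch assumes 1-based inclusive positions and assumes that an Initial position larger than the Terminal position denotes an ORF on the complementary strand. The two coordinate pairs are values appearing in the Initial and Terminal columns of Table 1, paired here on the basis that the result equals an ORF (bp) value printed in the table.

```python
from typing import Tuple


def orf_span(initial: int, terminal: int) -> Tuple[int, str]:
    """Return (length in bp, strand) for 1-based inclusive ORF coordinates."""
    length = abs(terminal - initial) + 1
    strand = "+" if terminal >= initial else "-"  # assumption: reversed order = complementary strand
    return length, strand


print(orf_span(2292, 3473))
print(orf_span(11177, 10107))
```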

[0373] Fig. 1 shows the positions of typical genes of the Corynebacterium glutamicum ATCC 13032 on the genome.

SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp)
3502 | (illegible) | 1572 | 1572
3503 | 1920 | 1597 | (illegible)
3504 | 2292 | 3473 | 1182
3505 | 3585 | 4766 | 1182
3506 | 4766 | 5299 | (illegible)
3507 | 5354 | 7486 | 2133
3508 | 7830 | 8795 | (illegible)
3509 | 9466 | 8798 | (illegible)
3510 | 9562 | 10071 | (illegible)
3511 | 9914 | 9474 | (illegible)
3512 | 11177 | 10107 | 1071
3513 | 11523 | 11263 | (illegible)
3514 | 11768 | 11523 | (illegible)
3515 | 11831 | 14398 | 2568
3516 | 14405 | 14745 | (illegible)
3517 | 16243 | 15209 | 1035
3518 | 16314 | 17207 | (illegible)
3519 | 17251 | 17670 | (illegible)
3520 | 18729 | 17860 | (illegible)
3521 | 19497 | 18736 | (illegible)
3522 | 19705 | 20073 | (illegible)

db Match | Homologous gene | Identity (%) | Similarity (%) | Function
gsp:R98523 | Brevibacterium flavum dnaA | (illegible) | 99.8 | replication initiation protein DnaA
(illegible) | Mycobacterium smegmatis dnaN | 50.5 | 81.8 | DNA polymerase III beta chain
(illegible) | Mycobacterium smegmatis recF | 53.3 | 79.9 | DNA replication protein (recF protein)
sp:YREG_STRCO | Streptomyces coelicolor yreG | 35.1 | 58.1 | hypothetical protein
pir:S44198 | Mycobacterium tuberculosis H37Rv gyrB | 71.9 | 88.9 | DNA topoisomerase (ATP-hydrolyzing)
(illegible) | Mycobacterium tuberculosis H37Rv | 29.4 | 50.7 | NAGC/XYLR repressor
(illegible) | Mycobacterium tuberculosis H37Rv Rv0006 gyrA | 70.4 | 88.1 | DNA gyrase subunit A
pir:E70698 | Mycobacterium tuberculosis H37Rv Rv0007 | 29.5 | 69.6 | hypothetical membrane protein
(illegible) | Escherichia coli K12 yeiH | 33.7 | 63.5 | hypothetical protein
(illegible) | Hydrogenophilus thermoluteolus TH-1 CbbR | 27.6 | 62.3 | bacterial regulatory protein, LysR type
gp:AF156103_2 | Rhodobacter capsulatus ccdA | 29.1 | 67.4 | cytochrome c biogenesis protein
pir:A49232 | Coxiella burnetii com1 | 31.6 | 64.5 | hypothetical protein
pir:F70664 | Mycobacterium tuberculosis H37Rv Rv1846c | 36.8 | (illegible) | repressor



SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp)
3523 | 20073 | 21065 | (illegible)
3524 | 21253 | 21074 | (illegible)
3525 | 21597 | 22124 | (illegible)
3526 | 22164 | 23399 | 1236
3527 | 23779 | 23615 | (illegible)
3528 | 24295 | 24729 | (illegible)
3529 | 26297 | 248S5 | (illegible)
3530 | 26338 | 26775 | (illegible)
3531 | 28099 | 26822 | 1278
3532 | 29117 | 28164 | (illegible)
3533 | 29965 | 29117 | (illegible)
3534 | 29995 | 30651 | (illegible)
3535 | 30697 | 31677 | (illegible)
3536 | 31677 | 32699 | 1023
3537 | 32699 | 33457 | (illegible)
3538 | 34280 | 33465 | (illegible)
3539 | 34339 | 34899 | (illegible)
3540 | 34982 | 35668 | (illegible)

db Match | Homologous gene | Identity (%) | Similarity (%) | Function
(illegible) | Mycobacterium leprae MLCB1788.18 | 24.9 | 50.8 | hypothetical membrane protein
pir:I40838 | Corynebacterium sp. ATCC 31090 | 65.4 | 88.5 | 2,5-diketo-D-gluconic acid reductase
(illegible) | Vibrio parahaemolyticus nutA | 27.0 | 56.1 | 5'-nucleotidase precursor
(illegible) | Deinococcus radiodurans DR0505 | 27.0 | 66.7 | 5'-nucleotidase family protein
prf:2513302C | Corynebacterium striatum ORF1 | 52.9 | 72.6 | transposase
(illegible) | Xanthomonas campestris phaseoli ohr | 51.8 | 79.9 | organic hydroperoxide detoxication enzyme
sp:RECG_THIFE | Thiobacillus ferrooxidans recG | 32.7 | (illegible) | ATP-dependent DNA helicase
(illegible) | Saccharomyces cerevisiae S288C YIR019C sta1 | (illegible) | (illegible) | glucan 1,4-alpha-glucosidase
(illegible) | Erysipelothrix rhusiopathiae ewlA | 28.9 | 63.7 | lipoprotein
gp:AF180520_3 | Streptococcus pyogenes SF370 mtsC | 34.6 | 74.1 | ABC 3 transport family or integral membrane protein
sp:FECE_ECOLI | Escherichia coli K12 fecE | 39.2 | 70.3 | iron(III) dicitrate transport ATP-binding protein
pir:A72417 | Thermotoga maritima MSB8 TM0114 | 25.8 | (illegible) | sugar ABC transporter, periplasmic sugar-binding protein
prf:1207243B | Escherichia coli K12 rbsC | 30.5 | (illegible) | high affinity ribose transport protein
sp:RBSA_BACSU | Bacillus subtilis 168 rbsA | 32.2 | 76.7 | ribose transport ATP-binding protein
pir:I51116 | Petromyzon marinus | 23.6 | 44.4 | neurofilament subunit NF-180
sp:CYPA_MYCTU | Mycobacterium leprae H37Rv Rv0009 ppiA | 79.9 | 89.9 | peptidyl-prolyl cis-trans isomerase A
(illegible) | Bacillus subtilis 168 yqgP | 29.2 | 63.1 | hypothetical membrane protein



SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp)
3541 | 37221 | 38198 | (illegible)
3542 | 37242 | 36247 | (illegible)
3543 | 38202 | 38978 | (illegible)
3544 | 38978 | 39799 | (illegible)
3545 | 40458 | 40189 | (illegible)
3546 | 42513 | 40576 | (illegible)
3547 | 43919 | 42513 | (illegible)
3548 | 45347 | 43926 | (illegible)
3549 | 46489 | 45347 | (illegible)
3550 | 48021 | 46669 | (illegible)
3551 | 48485 | 48024 | (illegible)
3552 | 49368 | 48505 | (illegible)
3553 | 49601 | 49455 | (illegible)
3554 | 50616 | 49897 | (illegible)
3555 | 50972 | 50754 | (illegible)
3556 | 51436 | 50966 | (illegible)
3557 | 53055 | 54008 | (illegible)
3558 | 53095 | 51626 | 1470
3559 | 54080 | 55546 | 1467
3560 | 56417 | 55629 | (illegible)

db Match | Homologous gene | Identity (%) | Similarity (%) | Function
(illegible) | Escherichia coli K12 fepG | 40.4 | 70.5 | ferric enterobactin transport system permease protein
gp:VCU52150_9 | Vibrio cholerae viuC | 51.8 | 81.8 | ATPase
(illegible) | Vibrio vulnificus M06-24 viuB | 26.2 | 52.7 | vulnibactin utilization protein
sp:Y011_MYCTU | Mycobacterium tuberculosis H37Rv Rv0011c | 40.0 | 72.6 | hypothetical membrane protein
(illegible) | Mycobacterium leprae pknB | 40.6 | 68.7 | serine/threonine protein kinase
gp:AF094711J | Streptomyces coelicolor pksC | 31.7 | 59.1 | serine/threonine protein kinase
(illegible) | Streptomyces griseus pbpA | 33.5 | 66.7 | penicillin-binding protein
sp:SP5E_BACSU | Bacillus subtilis 168 spoVE | 31.2 | (illegible) | stage V sporulation protein E
pir:H70699 | Mycobacterium tuberculosis H37Rv ppp | 44.1 | 70.8 | phosphoprotein phosphatase
(illegible) | Mycobacterium tuberculosis H37Rv Rv0019c | 38.7 | (illegible) | hypothetical protein
pir:B70700 | Mycobacterium tuberculosis H37Rv Rv0020c | 23.6 | 38.8 | hypothetical protein
sp:PH2M_TRICU | Trichosporon cutaneum ATCC 46490 | 29.9 | 63.3 | phenol 2-monooxygenase
sp:GABD_ECOLI | Escherichia coli K12 gabD | 46.7 | 78.2 | succinate-semialdehyde dehydrogenase (NAD(P)+)
(illegible) | Bacillus subtilis yrkH | 27.3 | 57.0 | hypothetical protein
(illegible) | Methanococcus jannaschii MJ0441 | 29.0 | 64.1 | hypothetical membrane protein
SEQ NO. (a.a.) | Initial (nt) | Terminal (nt) | ORF (bp)
3561 | 56676 | 56386 | (illegible)
3562 | 57270 | 56680 | (illegible)
3563 | 57478 | 57651 | (illegible)
3564 | 58087 | 58941 | (illegible)
3565 | 59091 | 59930 | (illegible)
3566 | 59952 | 60662 | (illegible)
3567 | 69909 | 62321 | 1653
3568 | 63508 | 62390 | 1119
3569 | 64040 | 63594 | (illegible)
3570 | 64190 | 65458 | 1269
3571 | 66197 | 65508 | (illegible)
3572 | 66851 | 67972 | 1122
3573 | 68170 | 68301 | (illegible)
3574 | 68634 | 68251 | (illegible)
3575 | 69060 | 69824 | (illegible)
3576 | 70186 | 68720 | 1467
3577 | 70506 | 72158 | 1653
3578 | 72043 | 71474 | (illegible)
3579 | 72161 | 72814 | (illegible)
3580 | 73728 | 72817 | (illegible)

db Match | Homologous gene | Identity (%) | Similarity (%) | Function
sp:YRKF_BACSU | Bacillus subtilis yrkF | 40.5 | 74.3 | hypothetical protein
sp:YC61_SYNY3 | Synechocystis sp. PCC6803 slr1261 | 36.3 | 70.4 | hypothetical protein
pir:G70988 | Mycobacterium tuberculosis H37Rv Rv1766 | 53.2 | 83.9 | hypothetical protein
(illegible) | Leishmania major L4768.11 | 26.8 | 50.7 | hypothetical protein
pir:F70952 | Mycobacterium tuberculosis H37Rv Rv1239c corA | 29.5 | 59.5 | magnesium and cobalt transport protein
gp:AF179611_12 | Zymomonas mobilis ZM4 clcb | 30.0 | 64.8 | chloride channel protein
(illegible) | Salmonella typhimurium pnuC | 24.1 | (illegible) | required for NMN transport
(illegible) | Mycobacterium tuberculosis H37Rv Rv2368c | 29.1 | (illegible) | phosphate starvation-induced protein-like protein
sp:CITM_BACSU | Bacillus subtilis citM | 42.3 | 68.8 | Mg(2+)/citrate complex secondary transporter
(illegible) | Escherichia coli K12 dpiB | 27.2 | 60.6 | two-component system sensor histidine kinase
sp:DPIA_ECOLI | Escherichia coli K12 criR | 33.2 | 63.3 | transcriptional regulator
(illegible) | Corynebacterium glutamicum unkdh | 43.3 | 73.7 | D-isomer specific 2-hydroxyacid dehydrogenase



EP 1 108 790 A2 



















: protein 








information 










: or urease 






Function 


hypothetical protein 


biotin synthase 


hypothetical protein 




hypothetical protein 




hypothetical protein 


hypothetical protein 


integral membrane efflux 


creatinine deaminase 






SIR2 gene family (silent! 
regulator) 


triacylglycerol lipase 


triacylglycerol lipase 




transcriptional regulator 


urease gammma subunit 
structural protein 


urease beta subunit 


urease alpha subunit 


Matched 
length 
(a.a) 


CN 


n 
n 


CO 


OO 




CN 


00 


r-- 
o 
m 


CO 






a> 

CN 


in 

CN 


CN 
CO 
CM 




^ — 


o 
o 


CM 
CD 


o 

m 


milarity 
(%) 


76.4 


99.7 


79.1 


63.5 




75,0 


66.0 


59.0 


99.8 






50.2 


59.0 


56.1 




Oi 


100.0 


100.0 


100.0 


CO 








































Identity 
(%) 


38.6 


ai 


72.1 


34.1 




71.0 


61.0 


25.6 


97.2 






26.2 


30.7 


cn 

CM 




90.6 


100.0 


100.0 


100.0 


Homologous gene 


Streptomyces coelicolor A3(2) 
SCM2.03 


Corynebacterium glutamicum 
bioB 


Mycobacterium tuberculosis 
H37RV Rv1590 


Saccharomyces cerevisiae 
YKL084W 




cn 
o> 

E 
S 

CD 

*o 
E 

OJ 

E ?! 

CO o 

x: a 
CJ 1- 


Chlamydia pneumoniae 


Streptomyces virginiae varS 


Bacillus sp. 






Saccharomyces cerevisiae hst2 


Propionibacterium acnes 


Propionibacterium acnes 




Corynebacterium glutamicum 
ureR 


Corynebacterium glutamicum 
ureA 


Corynebacterium glutamicum 
ATCC 13032 ureB 


Corynebacterium glutamicum 
ATCC 13032 ureC 


db Match 


gp:SCM2_3 


sp:BIOB_CORGL 


CN 

in 
o 

X 

Q. 


sp:YK14_YEAST 




PIR:F81737 


m 

CO 

>- 

CL 
CO 

o 


prf:2512333A 


gp:D38505_1 






sp:HST2_YEAST 


prf2316378A 


< 

00 

r*- 

CO 
CO 

CO 
CN 

t: 

Q_ 




gp:AB029154_1 


gp;AB029154_2 


gp:CGL251883_2 


CO 

1 

CO 
CO 
CO 

in 

CM 

—I 
CD 
O 

CL 

o> 




Ol * 
CN 


1002 


CO 
CM 


o> 
ro 

CO 






CO 
CN 


1449 


1245 


CD 

o 
CO 


in 

CD 


CN 

Oi 


CN 

cn 


O 
O 

cn 


00 
00 
OO 


CO 

in 


o 
o 

CO 


CD 
GO 


1710 


Terminal 
(nt) 


74272 


75491 


75742 


76035 


76469 


80613 


81002 


82120 


83691 


85098 


85663 


87241 


87561 


88545 


90445 


90461 


91473 


91988 


93701 


Initial 
(nt) 


73844 


74490 


75506 


75697 


76353 


80753 

1 


81274 


83568 


84935 


85403 


86277 


86318 


88532 


89444 


89558 


1 

90973 


1 

91174 


91503 


91992 


SEQ 

(a.a.) 


3581 


3582 


j3583 


3584 


'3585 


3586 


3587 


3588 


3589 


3590 


3591 


3592 


3593


3594 


3595 


3596 


3597 


3598 


3599 


SEQ 
NO. 
(DNA)


T— 

CO 


CN 
CO 


CO 
CO 


1 


LO 
CO 


i 

ID 

1 


oo 


OO 
CD 


cn 

CO 


o 


1 ^ 


CN 

cn 


CO 

cn 


1 

r 


in 


CD 
C3> 


h- 
cn 


00 

cn 


a> 
cn 
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3 



Function 


urease accessory protein 


urease accessory protein 


urease accessory protein 


urease accessory protein 


epoxide hydrolase 




valanimycin resistant protein 






heat shock protein (hsp90-family)


AMP nucleosidase 




1 

acetolactate synthase large subunit 




proline dehydrogenase/P5C 
dehydrogenase 




aryl-alcohol dehydrogenase 
(NADP+) 


pump protein (transport) 


indole-3-acetyl-Asp hydrolase




hypothetical membrane protein 




Matched 
length 
(aa) 


in 


CO 
CM 
CN 


in 
o 

CN 


CO 
GO 
CN 


o> 

CN 




CO 






CO 
CO 
CO 


CO 




<o 

Oi 




1297 




00 
CO 
CO 


CO 

s> 


CN 

in 

CO 




CO 

o 




Similarity 
(%) 


100.0 


100.0 


100.0


100.0 


48.4 




59.7






52.7 


68.2 




68.7 




50.4 




60.7 


71.4 


49.2 




70.8 




Identity 
(%) 


100.0 


100.0 


100.0 


100.0 


21.2 




26.5 






23.8 


41.0 




29.6 




25.8 




30.2 


36.5 


23.0 




35.9 




Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 ureE 


Corynebacterium glutamicum 
ATCC 13032 ureF 


Corynebacterium glutamicum
ATCC 13032 ureG 


Corynebacterium glutamicum 
ATCC 13032 ureD 


Agrobacterium radiobacter echA




Streptomyces viridifaciens vlmF






Escherichia coli K12 htpG 


Escherichia coli K12 amn 




Aeropyrum pernix K1 APE2509




Salmonella typhimurium putA 




Phanerochaete chrysosporium 
aad 


Escherichia coli K12 ydaH


(/> 
C 

£ 
o 

Ol 
O) 

n> 

O 
t> 

(Q 
JQ 

O 

S 
c 
UJ 




Escherichia coli K12 yidH




db Match 


r,' 

CO 
00 

m 
—1 

9 


gp:CGL251883_5 


CO 
03 
CO 

in 

CN 

_1 

o 

O 
Id. 
at 


I 

CO 
CO 
00 

in 

CN 

8 


prf:2318326B 




CN 
CN 
CO 

oo 
U- 

< 

icL 






-J 
o 
o 

^. 

o 
a 
1- 

X 

Q- 
to 


-J 
o 
a 

< 

d 

tn 




CO 
CO 

CN 
LU 

LJ 

'q. 




sp:PUTA_SALTY




sp:AAD_PHACH


-J 
o 
o 

UJ 

1 

i 

> 

iCL 
Ul 


< 

CN 
CN 
TT 
CN 

t: 
n. 




sp:YIDH_ECOLI 




u 




00 
CO 


in 

CO 


Oi 
CO 


UL 


Oi 
CO 
CO 


1152 


in 

CO 


2775 


1824 


1416 


o> 
m 


CN 
lO 

m 


O 
CO 
CO 


3456 


r— 


in 

CD 


1614 


1332 


a> 
Oi 
CO 


CO 
CO 
CO 


m 

CO 


Terminal 
(nt) 


94199 


94879 


95513 


96365 


96368 


98189


97319 


100493


98806 


101612 


104909 


105173


105841 


106630 


110890 


111274 


112318 


114083 


00 

m 


114564 


115943 


116263


Initial 
(nt) 


93729 


94202 


94699 


95517


97144 


97521 


98470 


99819 


101582 


103435 


103494 


105751 


106392 


107289 


107435 


111161 


111374 


112470 


114147 


115262 


115578 


Oi 
CD 

m 


to 2 S 


3600 


3601 


3602 


3603 


3604


3605 


3606 


3607 


3608 


3609 


CD 
CO 


3611 


3612 


3613 


3614 


3615 


3616 


3617 


3618 


3619 


3620 


3621 


SEQ
NO.
(DNA)


o 
o 


e i 
i 


CN 
O 

T— 

! 


CO 

o 


O ' 


in 
o 


CO 

o 


O 

1 


CO 

o 


Oi 
O 

" \ 


o 




CN 


CO 


^ 1 

1 


LO 


(O 

t — 


1^ 


CO 


cn 


o 

CN 


CN j 
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3 



o 



Function 




transcriptional repressor 


methylglyoxalase 


hypothetical protein 


mannitol dehydrogenase 


D-arabinitol transporter 




galactitol utilization operon repressor 


xylulose kinase 




pantoate-beta-alanine ligase 


3-methyl-2-oxobutanoate 
hydroxymethyltransferase 




DNA-3-methyladenine glycosylase 




esterase 




carbonate dehydratase 


xylose operon repressor protein


macrolide efflux protein 






Matched 
length 
(a.a) 




GO 
ID 
CM 


CO 
CM 


CM 
CO 




in 




o 

CO 
CM 


LO 




Ol 
CN 


CN 




188 




o 

CN 




o 

CN 


in 

CO 


00 




















































ni ^ 

■gb 




59.7 


78.6 


64.8 


70.4 


68.3 




64.6 


68.1 




100.0 


100.0 




67.6 




69.3 




53.2 


49.3 


61.2 






CO 














































Is 




29.5 


57.9 


37.0 


43.5


30.3 




CO 
CN 


45.0 




100.0 


100.0 




42.0 




39.3 




30.9 


24.1 


21.1 






Homologous gene 




Agrobacterium tumefaciens 
accR 


Bacillus subtilis yurT


Mycobacterium tuberculosis 
H37RV Rv1276c 


Pseudomonas fluorescens mtlD 


Klebsiella pneumoniae dalT 




Escherichia coli K12 gatR 


Streptomyces rubiginosus xylB 




Corynebacterium glutamicum 
ATCC 13032 panC


Corynebacterium glutamicum 
ATCC 13032 panB 




Arabidopsis thaliana mag 




Petroleum-degrading bacterium 
HD-1 hde 




Methanosarcina thermophila 


Bacillus subtilis W23 xylR


Lactococcus lactis mef214 






db Match 




sp:ACCR_AGRTU 


pir:C70019 


1- 
O 

>- 

1 

CO 

CJ 
>- 

Q. 
cn 


prf:2309180A


prf:2321326A




sp:GATR_ECOLI 


ZD 
Ql 

a: 
>- 

CO 

—1 
> 

x; 

CL 

tn 




gp:CGPAN_2 


gp:CGPAN_1




X 

\- 
< 

<. 

O 

ro 

id. 
cn 




gp:AB029896_1 




UJ 

1" 
UJ 

I 

< 
O 

cL 
(/> 


sp:XYLR_BACSU


gp:LLLPK214J2 






u 


2052 


o 

CO 

r- 


o 

CJ) 

cn 


O 
ID 


1509 


1335 


CD 
CO 


CO 

eo 


1419 


CN 
CM 
00 


CO 

oo 


CO 
GO 


ID 
Oi 


O 
CO 
CO 


LO 
CO 


CM 
O) 


CN 
CO 


00 

in 
in 


1143 


1272 


CD 
OO 




Terminal 
(nt) 


116548 


118810 


120410 


120413 


120951 


122507 


124030


124966 


126350 


127992 


126353 


127192 


128099


129489 


130798 


130815 


132424 


132981 


132971 


134207 


135518 


136122 


Initial 
(nt) 


118599 


119589 


120021 


120922 


122459 


123841 


123842 


124130 


124932 


127171 


127189 


128004 


129049 


130118


130145 


131738 


131798 


132424 


134113 


135478 


136321 


136565 


SEQ
NO.
(a.a.)


3622 


3623 


3624 


3625 


3626 


3627 


3628 


3629 


3630 


3631


3632


3633 


3634 


3635


3636 


3637 


3638 


3639 


o 

CO 
CO 


CO 
CO 


3642 


3643 


LJLJ 9 S 
CO 2 S 


CM 
CM 


CO 
CN 


CM 


ID 

1 ' 


CO 
CN 


CM 


oo 
CN 


<Ji 
CN 


!o 

: n 

1 

1 


CO 


CM 
CO 


CO 

ro 


CO 

1, 


in 

CO 

I 


CO 
CO 


t 

1 CO 


|co 


1 

C7) 
CO 


o 


1 


CN 


CO 

1 
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T3 
C 



O 

o 



Function 

1 








cellulose synthase 


hypothetical membrane protein 








chloramphenicol sensitive protein 


hypothetical membrane protein 






transport protein 


hypothetical membrane protein 






ATP-dependent helicase 




nodulation protein


DNA repair system specific for 
alkylated DNA 


DNA-3-methyladenine glycosylase


threonine efflux protein


hypothetical protein 


doxorubicin biosynthesis enzyme 


Matched 
length 
(a.a) 








o 


CO 
CD 
CO 








CO 

o 

CO 


GO 
C3> 
1 — 






CD 
CO 


CO 
CN 






cn 

CN 
OO 




OO 
CO 


cn 


CO 
CO 
r~ 


oi 


in 
m 


00 
CN 


Similarity 
(%) 








51.2 


GO 
LO 








50.7 


CJ) 

in 






62.3 


70.2 






64.3 




66.0 


60.7 


65.1 


CO 
CO 


72.7 


52.1 


Identity 
(%) 








24.3 


25.1 








CO 


30.3 






32.4 


34.7






33.8 




40.4 


34.7 


39.8 


34.1 


60S 


31.0 


Homologous gene 








Agrobacterium tumefaciens celA 


Saccharomyces cerevisiae 
YDR420W hkr1 








Pseudomonas aeruginosa rarD 


Escherichia coli K12 yadS 






Escherichia coli K12 abrB 


Escherichia coli K12 yfcA 






Escherichia coli K12 hrpB




Rhizobium leguminosarum bv. 
viciae plasmid pRL1JI nodL


Escherichia coli o373#1 alkB 


Escherichia coli K12 tag


Escherichia coli K12 rhtC


Bacillus subtilis yaaA 


Streptomyces peucetius dnrV 


db Match 








a> 

CO 

a. 


sp:HKR1_YEAST 








LU 

< 

LU 

CO 

a. 
o 

< 

on 

Q. 

tn 


_j 
o 
o 

LU 

1 

uy 

Q 
< 
> 

CL 

tn 






sp:ABRB_ECOLI


sp:YFCA_ECOLI






-J 
o 

o 

LU 

1 

CD 

a. 
a: 

X 

CL 
CO 




> 

—J 

X 

_.' 

Q 
O 

CL 

tn 


sp:ALKB_ECOLI 


_i 
o 

CJ 
UJ 

^1 

5 

CO 
cn 


sp:RHTC_ECOLI 


sp:YAAA_BACSU 


prf:2510326B


u 


1941 


1539 


(O 
CO 
(O 


1461 


1731 


CO 


1065 


CO 
CO 


CO 


r*- 
^^ 


CO 
CO 
CO 


1659 


1137 


CO 


CN 
(O 


m 


2388 


m 

CO 


m 
co 


o 
cn 

CD 


in 

CN 

m 


00 
CO 


CD 
CN 


cvi 
in 

CO 


Terminal 
(nt) 


138744 


140329 


139226 


141789 


143526 


143075 


144639


145480


145518 


147238


147570 


149780 


ay 

XT 


152369


150966 


152814 


153226 


156167 


156147 


157537 


158138 


158831 


159159 


160013 


Initial 
(nt) 


136804 


138791 


139861 


140329 


141796


142455 


143575 


144725 


146396 


146522 


147238 


148122 


150930


151572 


151589 


152410 


155613 


155853 


156821 


156848 


157614 


158154 


158869 


159162 


SEQ 

NO. 
(a.a.) 


CO 
CO 


3645 


3646 


3647 


3648 


3649


3650 


3651 


3652 


3653


3654 


3655 


3656


3657 


3658 


3659


3660 


3661 


3662


3663 


3664 


3665 


3666 


3667 


SEQ 
NO. 
(DNA) 


-c- 


ir> 


CO 


j T 


\ 




o 

IT) 


LO 


CvJ 

cn 


CO 

m 


in 

1 


in 
in 


CO 

in 


cn 


00 

in 


O) 

in 


o 

CO 


5 


CN 
CO 


CO 
CO 


•«r ! tr> 

(O 1 CO 

1 




CO 
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3 
C 

c 



Function 


methyltransferase








ribonuclease 






neprilysin-like metallopeptidase 1




transcriptional regulator, GntR family 
or fatty acyl-responsive regulator 


fructokinase or carbohydrate kinase 


hypothetical protein 


methylmalonic acid semialdehyde 
dehydrogenase 


myo-inositol catabolism 


myo-inositol catabolism 


rhizopine catabolism protein 


myo-inositol 2-dehydrogenase 


myo-inositol catabolism


metabolite export pump of 
tetracenomycin C resistance 




oxidoreductase 






Matched 
length 
(a.a) 


o 








GO 






CM 
CM 

r*- 




00 
CO 
CM 


CM 
CO 
CO 


CO 

<j) 

CM 


oo 

CD 


CO 

to 

CM 


CO 
00 

tn 


o 
o> 

CM 


tn 

CO 
CO 


OO 
CM 


in 




in 

CO 




Similarity 
(%) 


56.7 








76.3 






57.2 




65.6 


63.0 


ci 

CO 


86.1 


58.2 


69.8 


51.0 


72.2 


72.1 


61.5 




65.5 




Identity 
(%) 


35.6 








41.5 






28.5 




29.8 


28.6 


52.7 


61.0 


33.2 


41.0 


o> 

CM 


39.1 


44.6 


30.9 




31.1 




Homologous gene 


Schizosaccharomyces pombe 
SPAC1250.04c








Neisseria meningitidis MC58 
NMB0662 






Mus musculus nil 




Escherichia coli K12 farR


Beta vulgaris 


Streptomyces coelicolor A3(2) 
SC8F11.03c


Streptomyces coelicolor msdA 


Bacillus subtilis iolB 


Bacillus subtilis iolD 


Rhizobium meliloti mocC 


Bacillus subtilis idh or iolG 


Bacillus subtilis iolH 


Streptomyces glaucescens tcmA 




Bacillus subtilis yvaA 




db Match 


CO 

o' 

in 

CM 

a 

O) 








gp:AE002420_13






gp:AF176569_1 




—i 
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Function 




arabinosyl transferase 


hypothetical membrane protein 


acetoacetyl CoA reductase


oxidoreductase 








proteophosphoglycan 


hypothetical protein 




hypothetical protein 


rhamnosyl transferase 




hypothetical protein 


O-antigen export system ATP- 
binding protein 


O-antigen export system permease 
protein 


hypothetical protein 


NADPH quinone oxidoreductase
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Function 




probable electron transfer protein 


amino acid carrier protein




molybdopterin biosynthesis protein 
moeB (sulfurylase) 


molybdopterin synthase, large 
subunit 


molybdenum cofactor biosynthesis 
protein CB 


co-factor synthesis protein 


molybdopterin co-factor synthesis 
protein 


hypothetical membrane protein 


molybdate-binding periplasmic 
protein 


molybdopterin converting factor 
subunit 1 


maltose transport protein


hypothetical membrane protein 


histidinol-phosphate 
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transcription factor 


alcohol dehydrogenase 


putrescine oxidase 


magnesium ion transporter 




Na/dicarboxylate cotransporter 


oxidoreductase 


hypothetical protein 


nitrogen fixation protein 
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aspartate transaminase 




DNA polymerase III holoenzyme tau 
subunit 




hypothetical protein 


recombination protein 


cobyric acid synthase 


UDP-N-acetylmuramyl tripeptide 
synthetase 


DNA polymerase III epsilon chain 


hypothetical membrane protein 


aspartate kinase alpha chain 






extracytoplasmic function alternative 
Sigma factor 


vegetative catalase 






leucine-responsive regulatory 
protein 
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metalloregulatory protein 


arsenic oxyanion-translocation pump 
membrane subunit 


arsenate reductase 








Na+/H+ antiporter or multiple 
resistance and pH regulation related 
protein D 


Na+/H+ antiporter 
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protein A 
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Bacillus subtilis yqeV 


db Match 






gp:AF178758_1 


in 

CO 
LL. 

< 

CL 
Ol 


sp:ARSC_STAXY 








gp:AF097740_4 


prf:2504285D 


gp:AF097740_1 








ID 
LU 

o 
—1 

< 

! 

DC 
O 
rsi 
O 
cL 

CA 


prf:2214304B 


CL 
< 
d 
tn 




pir:B69865 


Z) 
C/) 

o 
< 

>■' 

LU 
O 
> 

tn 


ORF 
(bp) 


CM 


in 

CO 


in 

CO 


1080 


CO 
CO 


CO 

y— 
CO 


o 

CN 


CO 

m 


1530 


oo 

CO 


2886 


1485 


CO 

o 

CO 


CD 
CO 


CD 
CD 
CD 


1467 


CO 
O 
CO 


t— 
CD 

m 


in 
S> 


CO 

in 


Terminal 
(nt) 


277904 


277987 


278388 


279893 


280279 


280349 


280670 


280949 


281404 


282937 


283317 


287857 


287059 


287966 


289131 


289777 


292417 


291273 


292597 


293991 


Initial 
(nt) 


277581 


278301 


278732 


278814 


279893 


280666 


280939 


281401 


282933 


283317 


286202 


286373 


287661 


o> 

CM 
CO 
QO 
CO 
CM 


289796 


291243 


291815 


291833 


293511 


293539 


CO ^ 


3790 


3791 


3792 


3793 


3794 


3795 


3796 


3797 
3798 


3799 


3800 


3801 


3802 


3803 


3804 


3805 


3806 


3807 


3808 


3809 


SEQ 

NO. 
(DNA) 


o 
ay 

CN 


ay 

CN 


CN 

o> 

CN 


CO 

cn 

CN 


CN 


m 
a> 

CM 


CD 

cy> 

CN 


297 
298 


cn 

O) 
CN 


o 
o 

CO 


o 

CO 


CN 

ico 

i 


CO 

o 

CO 


1 ^ 
o 

CO 


in 
o 

CO 


CD 
O 
CO 


o 

CO 


CO 

o 

CO 


a> 
o 

CO 






3 

s 



Function 


class A penicillin-binding
protein (PBP1)


regulatory protein 




hypothetical protein 


transcriptional regulator 


shikimate transport protein 




long-chain-fatty-acid-CoA ligase


transcriptional regulator 
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hypothetical protein 


serine proteinase 


epoxide hydrolase 


hypothetical membrane protein 


phosphoserine phosphatase 


hypothetical protein 


conjugal transfer region protein




hypothetical membrane protein 


hypothetical protein 


hypothetical protein 








ATP-dependent RNA helicase


cold shock protein 




DNA topoisomerase \ 
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adenylate cyclase 


DNA polymerase III subunit 
tau/gamma 




hypothetical protein 
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ribosomal large subunit 
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beta-glucosidase 
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Homologous gene 


Stigmatella aurantiaca B17R20
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Bacillus subtitis dnaX 
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Deinococcus radiodurans 
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NADP-dependent alcohol 
dehydrogenase 


glucose-1-phosphate
thymidylyltransferase 


dTDP-4-keto-L-rhamnose reductase 


dTDP-glucose 4,6-dehydratase 
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pilin glycosytation protein 


capsular polysaccharide 
biosynthesis 


lipopolysaccharide biosynthesis / 
export protein 
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2-oxogli;tarate decarboxylase and 2- 
succinyl-6-hydroxy-2,4- 
cyclohexadiene-1-carboxylate 
synthase 


hypothetical membrane protein 


alpha-D-mannose-alpha(1-6)phosphatidyl myo-inositol
monomannoside transferase


D-serine/D-alanine/glycine 
transporter 


ubiquinone/menaquinone 
biosynthesis methyltransferase 
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component II 


preprotein translocase SecE subunit 
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succinate-semialdehyde 
dehydrogenase (NAD(P)+) 


novel two-component regulatory 
system 


tyrosine-specific transport protein 


cation-transporting ATPase G 


hypothetical protein or 
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30S ribosomal protein 312 


30S ribosomal protein S7


elongation factor G






lipoprotein






ferric enterobactin transport ATP- 
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ferric enterobactin transport protein 
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503 ribosomal protein L22 


30S ribosomal protein S3 


50S ribosomal protein L16


50S ribosomal protein L29


30S ribosomal protein S17 








50S ribosomal protein L14


50S ribosomal protein L24


50S ribosomal protein L5
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hypothetical protein 


hypothetical protein 


30S ribosomal protein S8 


50S ribosomal protein L6 


50S ribosomal protein L18


30S ribosomal protein S5 


50S ribosomal protein L30 


50S ribosomal protein L15




methylmalonic acid semialdehyde 
dehydrogenase 




novel two-component regulatory 
system 


aldehyde dehydrogenase or betaine 
aldehyde dehydrogenase 






reductase 
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hypothetical protein 
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cytochrome P450 




Matched 
length 
(a.a) 


u> 
o 


o 


CN 
CD 


cn 


o 


■r- 


m 


CD 
XT 




oo 

CN 




m 

CN 

T— 


CO 






o 


o 


m 

CN 


o 
in 


CJ) 
CN 
CO 


CO 

CD 


CN 
OJ 




Similarity 


O 

tn 


66.7 


97.7 


87.7 


90.9 


88.3 


76.4 


87.4




68.8 




52.0 


71.5 






71.6 


66.4 


70.8 


56.0 


45.0 


66.7 


65.2 




Identity 
(%) 


24.7 


42.7 


75.8 


59.2 


67.3 


67.8 


CO 

^* 
lf> 


66.4 




46.9 




47.0 


41.7 






41.1 


47.7 


35.8 


50.0


22.9 


38.6 


34.8 


Table 1 (continued)


Homologous gene


Archaeoglobus fulgidus AF1398


Deinococcus radiodurans
DR0763 


Micrococcus luteus 


Micrococcus luteus 


Micrococcus luteus rplR


Micrococcus luteus rpsE 


Escherichia coli K12 rpmJ 


Micrococcus luteus rplO




Streptomyces coelicolor msdA 




Azospirillum brasilense carR 


Rhodococcus rhodochrous 
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transcriptional repressor 


adenylate kinase




methionine aminopeptidase 1




translation initiation factor IF-1 


30S ribosomal protein S13


30S ribosomal protein S11


30S ribosomal protein S4


RNA polymerase alpha subunit 




50S ribosomal protein L17
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Function 


high-alkaline serine proteinase 


hypothetical membrane protein 


hypothetical membrane protein 








hypothetical protein 


early secretory antigen target ESAT- 
6 protein 


50S ribosomal protein L13


30S ribosomal protein S9 


phosphoglucosamine mutase 




hypothetical protein 
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Table 1 (continue 


Homologous gene 


Bacillus alcalophilus


Mycobacterium tuberculosi 
H37RV Rv3447c 
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Function 


IMP dehydrogenase 


hypothetical membrane protein 


glutamate synthetase positive 
regulator 


1 

GMP synthetase 








hypothetical membrane protein 


two-component system sensor 
histidine kinase 


transcriptional regulator or 
extracellular proteinase response 
regulator 








hypothetical protein 


hypothetical protein 




hypothetical protein 


hypothetical membrane protein 
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Table 1 (continued) 


Homologous gene 


Corynebacterium 
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Escherichia coli K12ybiF 


Bacillus subtilis gltC


Corynebacterium 
ammoniagenes guaA 








Streptomyces coelicolor A3(2) 


Streptomyces coelicolor A3(2) 
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Bacillus subtilis 168degU 








Mycobacterium tuberculosis 
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hypothetical membrane prot 


phytoene desaturase 


phytoene synthase 


transmembrane transport pi 


transcriptional regulator (Ma 
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outer membrane lipoprotein 


hypothetical protein 


DNA photolyase 
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Function 


hypothetical membrane protein 




transcriptional repressor 


hypothetical protein 




transcriptional regulator (Sir2 family)


hypothetical protein 


iron-regulated lipoprotein precursor 


rRNA methylase 


methylenetetrahydrofolate
dehydrogenase 


hypothetical membrane protein


hypothetical protein 




homoserine O-acetyltransferase 


O-acetylhomoserine sulfhydrylase 


carbon starvation protein 
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Homologous gene 


Streptomyces coelicolor A3(2) 
SCE9.01 




Mycobacterium tuberculosis 
H37RV RV2788 SirR 


Streptomyces coelicolor A3(2)
SCG8A.05C 




Archaeoglobus fulgidus AF1676


Streptomyces coelicolor A3(2) 
SC5H1.34


Corynebacterium diphtheriae
irp1


Mycobacterium tuberculosis 
H37Rv Rv3366 spoU 


Mycobacterium tuberculosis 
H37Rv Rv3356c folD 


— " 1 

Mycobacterium leprae 
MLCB1779.16C 


Streptomyces coelicolor A3(2)
SC66T3.18C 




Corynebacterium glutamicum 
metA


Leptospira meyeri metY 


Escherichia coli K12 cstA




Escherichia coli K12 yjiX




db Match 


^1 
o> 
LU 
U 
O) 




C70884 


SCG8A_5 




C69459 


SC5H1_34 


5 

CN 
O 

3 
Q 
U 


E70971 


C70970 


oo 

CJ>' 

(J 
-J 


:SC66T3_18




^1 

CN 

in 

CO 
OJ 

in 
o 

LL 
< 


:2317335A 


CSTA_ECOLI




_j 
o 
a 

UJ 

1 

X 
-o 
>- 








gp: 




pir: 


cn 




a. 


CL 

o> 


CL 
OI 


pir 


Ou 


CL 

o> 


CL 
Oi 




CL 
C7} 


XZ 
CL 


sp: 




a. 
tn 






ORF 

(Dp) 


1413 


00 

cn 


o> 

CD 
CD 


00 


co 
cn 




CM 
CJ) 


CO 
<Ji 
<TJ 


^— 


CM 

in 

CO 


in 
in 

CM 


1380 


co 

CO 
CJ) 


1131 


1311 


2202 


o 

CO 


O 

CM 


CD 

o 
to 




Terminal 
(nt) 


656534 


655097 


657215 


657205 


658142 


658928 


659424 


660538 


660650 


662017 


662374 


662382 


664126 


665183 


666460 


670465 


669445 


670672 


671045 




Initial 
(nt) 


655122 


655834 


656547 


658002 


658005 


658155 


658933 


659543 


661120 


661166 


1 

662120 


663761 


665088 


666313 


667770 


668264 


670053 


670472 


671653 




SEQ 
NO. 
(a.a.)


4212 


4213 


CN 


4215 


CO 
CM 


4217 


oo 

Cv» 


4219 


4220 


4221 


4222 


4223 


4224 


4225 


4226


4227


4228 


4229 


4230 




SEQ 
NO. 
(DNA) 


<N 


CO 




tn 


to 




i ^ 
1 ^ 


o> 

1^ 


o 

CN 


CM 


CM 
OI 


fO 
CM 


OJ 


1 u, 

1 

1 ^ 


CD 
CM 


OJ 


00 
1 OI 

1 


a> 

CM 


o 

CO 
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5 
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Function 


hypothetical protein 


carboxy phosphoenolpyruvate 
mutase 


citrate synthase 




hypothetical protein 




L-malate dehydrogenase 


regulatory protein 




vibriobactin utilization protein 


ABC transporter ATP-binding protein 


ABC transporter 


ABC transporter 


iron-regulated lipoprotein precursor 


chloramphenicol resistance protein 


catabolite repression control protein 


hypothetical protein 






Matched 
length 
(aa) 




CO 
CN 


O 
00 
CO 




CO 

m 




00 
CO 
CO 


CO 
CM 
CN 




CO 

CM 


Of) 
CD 
CM 


O) 
CO 
CO 


o 

CO 
CO 


CD 

in 

CO 


vn 

O) 
CO 


CO 

o 

CO 


CD 
CN 






Similarity 
(%) 


86.4 


76.2 


81.3 




62.3 




67.5 


62.8 




54.2 


85.1 


86.4 


88.2 


82.3 


69.6 


58.1 


85.8 






Identity 
(%) 


71.0 


CO 


56.1 




34.0 




37.6 


26.1 




25.4 


55.4


CO 
CD 

m 


63.0 


53.1 


32.2 


30.4 


56.2 




Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37RV Rv1130 


Streptomyces hygroscopicus


Mycobacterium smegmatis 
ATCC 607 gltA




Escherichia coli K12 yneC 




Methanothermus fervidus V24S 
mdh 


Bacillus stearothermophilus T-6 
uxuR




Vibrio cholerae OGAWA 3g5 
viuB 


Corynebacterium diphtheriae 
irp1D


Corynebacterium diphtheriae 
irp1C


Corynebacterium diphtheriae 
irp1B


Corynebacterium diphtheriae 
irp1


Streptomyces venezuelae cmlv


Pseudomonas aeruginosa crc 


Haemophilus influenzae Rd 
HI1240 






db Match 


pir:C70539


prf:1902224A


O 

>* 
(/3 

O 

CL 
t/i 




sp:YNEC_ECOLI 




UJ 

u_ 
t- 
m 

I 

X 

a 
:g 

a. 
</) 


prf:2514353L 




o 

CD 

> 

I 

CD 
> 


gp:AF176902_3


gp:AF176902_2 


gp:AF176902_1


^' 
Id 

CN 
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3 
Q 
O 

d. 


prf:2202262A 


prf:2222220B 


sp:YICG_HAEIN 






ORF 




CN 

T— 

C3> 
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ro 
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CM 
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CM 
1^ 
CO 


1041 


o 

CM 


CM 
O 


CD 
00 


o 

CO 


1059 


CO 
CJ> 

cn 


1050 


1272 


CN 
CD 


in 

CO 


195 




Terminal 
(nt) 


672653 


673576 


674756 


672710 


674799


675846 


675082 


676218 


677047 


680131 


681040 


681846


682871 


683876 


686380 


687346 


688007 


688335 




Initial 
(nt) 


671700 


672665 


673608 


673639


674990


675175 


676122 


676937 


677748 


681027 


681846 


682904 


683866 


684925 


685109 


686435 


687351 


688141 




SEQ 

NO. 
(a.a.)


4231 


4232 


CO 
CO 
CN 


CO 
CN 


4235 


4236 


4237 


CO 
CO 
CM 


4239


o 

CM 


4241 


4242 


CO 
CN 


<N 


4245 


4246 


4247 


CO 

CN 
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SEQ 

NO. 
(DNA) 


CO 

p- 


CM 

ro 


CO 
CO 


CO 


CO 


CD 
CO 


ro 


CO 
CO 


O) 

ro 


O 




CM 

i 


CO 






to 




00 






Function 




ferrichrome ABC transporter 


hemin permease 


tryptophanyl-tRNA synthetase 


hypothetical protein 




penicillin-binding protein 6B 
precursor 


hypothetical protein 


hypothetical protein 






uracil phosphoribosyltransferase 


bacterial regulatory protein, lacI
family 


N-acyl-L-amino acid amidohydrolase 
or peptidase 


phosphomannomutase 


dihydrolipoamide dehydrogenase 


pyruvate carboxylase 


hypothetical protein 


hypothetical protein 


Matched 
length 
(a.a) 






CO 
CO 


CO 
CO 


CO 
CN 




5 

CO 




CO 
CN 
CO 






cn 
o 

CN 




in 

00 
CO 


5 

m 


oo 

CO 


1140 


CO 
CD 
CsJ 


CM 


Similarity 
(%) 




73.8 


>r— 

cr3 

CO 


79.8 


72.3 




57.5 


70.7 


52.6 






72.3


66.2 


80.5 


53.8 


o 

ifi 
to 


100.0 


60.1 


66.9 


Identity 
(%) 




in 


38.7 


54.4 


37.1 




30.9 


34.1 


29.4 








41.6 


51.4 


22.1 


31.6 


100.0 


26.2 


30.7 


Homologous gene 




Corynebacterium diphtheriae 
hmuV 


Yersinia enterocolitica hemU 


Escherichia coli K12 trpS


Escherichia coli K12 yhjD 




Salmonella typhimurium LT2 
dacD 


Mycobacterium tuberculosis 
H37Rv Rv3311


Streptomyces coelicolor A3(2) 
SC6G10.08c






Lactococcus lactis upp 


Streptomyces coelicolor A3(2) 
SC1A2.11 


Mycobacterium tuberculosis 
H37RV Rv3305c amiA 


Mycoplasma pirum BER manB 


Halobacterium volcanii ATCC 
29605 lpd


Corynebacterium glutamicum 
strain 21253 pyc


Mycobacterium tuberculosis 
H37RV Rv1324 


Streptomyces coelicolor A3(2) 
SCF11.30 


db Match 




o 

1 

CN 

<o 

CD 
O 

UL 
< 
Cl 
Ol 


pir:S54438 


_i 
o 
o 

UJ 

^1 

CO 

a. 

</> 


sp:YHJD_ECOLI




sp:DACD_SALTY


pir:F70842 


gp:SC6G10J 






O 

5, 

OL 
QL 

CL 
CO 


gp:SC1A2_11 


pir:H70841 


CL 

o 

( 

CD 
Z 
< 

CL 

tn 


sp:DLDH_HALVO 


prf:2415454A 


sp:YD24_MYCTU 


gp:SCF11_30 


ORF 
(bp) 


in 
cn 


o 

GO 


1017 


1035 


1083 


co 
o 
cn 


1137 


1227 


CO 

in 

CO 


in 
o> 


T — 
in 

CO 


CO 
CO 

to 


CO 


1182 


1725


1407 


3420 


o 
r-- 

ao 


CO 


Terminal 
(nt) 


688916 


689917 


690706 


692916 


694110 


695074 


695077 


696769 


698065 


699266 


698922 


699913 


700361 


703262 


700364 


704811


708630 


709708 


710278 


Initial 
(nt) 


689890 


690696 


691722 


691682 


693028 


694172 


695213 


697995


698922 


699072


699272 


699281 


699998 


702081 


702108 


703405


705211 


708839 


709793 


CO ^ -S. 


cn 


4250 


4251 


4252 


4253 


4254 


4255 
4256 


4257 


00 

in 

CM 


4259 


4260 


4261 


CM 
CO 
CM 
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4264 


4265 
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4267 
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ON 
DBS 


o 


o 

m 


in 


CM 

in 


CO 
IT) 


^ 

in 


in to 
in i in 
r-- 1 

1 


in 


in 
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~760 


CO 

1^ 


CM 
CD 


CO 
CD 


<o 


in 

CO 


CD 
CD 
1^ 
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Function 


hypothetical protein 


thioredoxin reductase 


PrpD protein for propionate 
catabolism 


carboxy phosphoenolpyruvate 
mutase 


hypothetical protein 


citrate synthase 




hypothetical protein 






thiosulfate sulfurtransferase 


hypothetical protein 


hypothetical protein 


hypothetical membrane protein 


hypothetical protein 


hypothetical protein 


detergent sensitivity rescuer or 
carboxyl transferase 


detergent sensitivity rescuer or 
carboxyl transferase 




Matched 
length 
(a.a) 


S 

CO 


LO 
O 
CO 


ir> 


CO 

r-- 

CN 


<o 

O) 


CO 
oo 
CO 




CO 

in 






to 

CN 
CN 


CN 

in 
<o 


CO 
CO 


GO 


CN 


CO 
CO 


ro 
in 


CO 












































Similarity
(%)


69.0 


59.3 


49.5 


74.5 


47.0 


78.9 




72.6 






100.0 


CO 

o> 


76.7 


63.4 


66.2 


69.6 


100.0 


100.0 




Identity 
(%) 


44.6 


24.6 


24.0 


42.5 


39.0 


54.6 




40.8 






100.0 


61.1 


51.1 


35.1 


31.8


33.3 


oo 

O) 

a? 


99.6 


Table 1 (continued)


Homologous gene 


Bacillus subtilis 168 yciC 


Bacillus subtilis IS58 trxB


Salmonella typhimurium LT2 
prpD 


Streptomyces hygroscopicus


Aeropyrum pernix K1 APE0223 


Mycobacterium smegmatis 
ATCC 607 gItA 




Mycobacterium tuberculosis 
H37RV Rv1129c 






Corynebacterium glutamicum
ATCC 13032 thtR 


Campylobacter jejuni Cj0069 


Mycobacterium leprae 
MLCB4.27C 


Mycobacterium tuberculosis 
H37RV Rv1565c 


Escherichia coli K12 yceF 


Mycobacterium leprae B1308- 
C3-211 


Corynebacterium glutamicum 
AJ11060 dtsR2 


Corynebacterium glutamicum 
AJ11060 dtsR1
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db Match 


pir:B69760 


CQ 

m' 

X 

h- 

Q. 
cn 


sp:PRPD_SALTY 


prf:1902224A


PIR:E72779 


w 

CJ 

CO 

u 

Q_ 
tn 




pir:B70539 






sp:THTR_CORGL


gp:CJ11168X1_62 


CO 
CD 

o 

a> 


pir:G70539


sp:YCEF_ECOLI 


prf:2323363CF


gp:AB018531_2


pir:JC4991 




u 


1086 


CN 

cn 


ay 
-<r 


888 


00 

cn 


1182 


CO 


1323 


(O 
CN 


1359 


CO 

o 

CD 


1065 




2148 


S 

lO 


(D 
■V 
CN 


1611 


1629 




Terminal 
(nt) 


710520


712647


714231 


715145


714380 


716283 


716286 


716687 


718350 


720016 


720547 


722841 


722925 


725559 


725872 


726470 


726742 


728696 




Initial 
(nt) 


711605 


711724 


712738 


714258 


714757 


715102 


716660 


718009 


718105 


718658 


721449 


721777


723338 


723412 


726462 


726715 


728352 


730324 




SEQ
NO.


4268


4269 


4270


4271 


4272 


4273 


4274 


LO 
CN 


4276 


4277 


4278 


4279 


4280 


4281 


4282 


4283 


4284 


4285 




SEQ

NO.

(DNA) 


CO 
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<o 


1 

o 
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CO 




LO 


CD 


Ul 


CO 


Oi 


o 

CO 


CO 


CN 
CO 


to 

CO 


oo 


lO 
CO 
Function


bifunctional protein (biotin synthesis
repressor and biotin acetyl-CoA
carboxylase ligase)


hypothetical membrane protein 


5'-phosphoribosyl-5-amino-4- 
imidasol carboxylase 


K+-uptake protein 






5'-phosphoribosyl-5-amino-4-
imidasol carboxylase


hypothetical protein 


hypothetical protein 


nitrilotriacetate monooxygenase 


in 

th 

CO 

o> 

(0 

tn 
(Q 

(O 
O 
CL 
(/) 
C 
CO 


glucose 1-dehydrogenase


hypothetical membrane protein 




hypothetical protein 


hypothetical protein 




Matched 
length 
(aa) 


CO 
O) 
(N 


in 

<D 


Oi 
CO 


CO 
CN 
CD 






•V 
t — 


CN 

in 


in 
to 

CN 


CO 
CN 


CO 

o 

CO 


CO 

in 

CN 


CO 
O) 




in 


CN 




Similarity
(%)




00 

Jo 


58.8 


83.8 


73.6 






93.2 


60.5 


70.6 


73.0 


52.5 


64.8 


68.8 




66.3 


76.8 




CO 






































Identity 
(%) 


28.7 


23.0 


69.0 


41.1 






85.7 


36.2 


42.8 


43.2 


23.4 


31.3 


29.2 




28.6


35.9 




Homologous gene 


Escherichia coli K12 birA


Mycobacterium tuberculosis 
H37RV Rv3278c 


Corynebacterium 
ammoniagenes ATCC 6872 
purK 


Escherichia coli K12 kup






Corynebacterium 
ammoniagenes ATCC 6872 
purE 


Actinosynnema pretiosum 


Streptomyces coelicolor A3(2) 
SCF43A.36 


Chelatobacter heintzii ATCC
29600 ntaA 


Archaeoglobus fulgidus 


Bacillus megaterium IAM 1030
gdhll 


Thermotoga maritima MSB8
TM1408 




Bacillus subtilis 168 ywjB


Streptomyces coelicolor A3(2) 
SCJ9A.21 




db Match 


sp:BIRA_ECOLI 


cn 
o 

e) 

"Ol 


< 

o 
o 

^' 

tx 

Z) 
CL 

d. 

<n 


sp:KUP_ECOLI






< 
cr 
o 
o 

co' 
tr 

CL 
CL 

%n 


gp:APU33059_5 


gp:SCF43A_36 


sp:NTAA_CHEHE 


CD 
CN 

rr 

'a. 


UJ 

O 
< 
m 

cn' 
O 
X 

o 

vt 


pir:A72258 




sp:YWJB_BACSU 


5' 

o 

CO 

CL 




ORF 
(bp) 


(D 
CO 


CO 
CO 
TT 


1161 


1872 


in 
5 


in 

CO 


in 


CO 

in 


CN 


1314 


1500 


o> 

OO 


<j> 

CO 
CO 


CN 
CO 


CO 

in 


o 

CN 


CN 
CM 
CN 


Terminal 
(nt) 


731299 


731797 


733017 


734943


733183 


735340 


735896 


736351 


737204 


737216 


738673 


740228 


741765 


742195 


741818 


742828 


742831 


Initial 
(nt) 


730436 


731312 


731857 


733072 


733797 


734984 


735402 


735899


736413 


738529 


740172 


741016 


741397 


in 

CO 

■a- 


742384 


Ol 

o 

CN 


743052 


CO ^ 5 


4286 


4287 


4288 


4289 


o 
cn 

CN 


4291 


4292 


4293 


4294 


4295 


4296 


4297 


4298 


a> 

CT» 
fN 


4300 


4301 


CM 

o 


rS i 
1 CO 2 S 




CD 
CO 


1 

OO 
1^ 


OO 
OO 


cn 
oo 


O 
CD 


<jy 


CN 
CD 


CO 

O) 


Oi 


in 


to 
o> 


Ol 
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a> 


CD 

cn 


! 

1 CD 
00 


o 

CO 

1 


CN 
O 
OO 
Function


trehalose/maltose-binding protein


trehalose/maltose-binding protein 




trehalose/maltose-binding protein




ABC transporter ATP-binding protein 
(ABC-type sugar transport protein) 
or cellobiose/maltose transport 
protein 




RNA helicase 






hypothetical protein 


hypothetical protein 


DNA helicase II










RNA helicase 


hypothetical protein 


RNA polymerase associated protein 
(ATP-dependent helicase) 


Matched 
length 
(a.a) 


T- 

CM 


CO 

o 

CO 




1 — 




CM 
CO 
CO 




1783 






o 

CM 


O 
CM 


o 










2033 


oo 

CD 
CO 


CO 
CO 


Similarity 
(%) 


75.3 


CO 

o 




62.4 




73.9 




CD 

oi 






59.2 


62.5 


? 










45.8 


53.2 


48.6


Identity 
(%) 


42.4 


37.3 




30.9 




57.2 




25.1 






31.7 


30.0 


20.7 










22.4 


24.4 


23.1 


Homologous gene 


Thermococcus litoralis malG


Thermococcus litoralis malF 




Thermococcus litoralis malE 




Streptomyces reticuli msiK 




Deinococcus radiodurans R1 
DRB0135 






Mycobacterium tuberculosis 
H37RV Rv3268 


Helicobacter pylori J99 jhp0462


Escherichia coli K12 uvrD 










Streptomyces coelicolor
SCH5.13


Halobacterium sp. NRC-1 
plasmid pNRC100H1130 


Escherichia coli K12 hepA 


db Match 


o 

U-) 

in 

CO 
(D 

o 

CNJ 

t: 

d. 


prf:2406355B 




< 

LO 
LO 
CO 

to 

O 

CM 
*^ 
O. 




prf:2308356A 




pir:B75633






pir:E70978


O) 
CM 

cn 

r— 

o 

L_* 


sp:UVRD_ECOLI 










CO 
CO 
CO 

1- 
S. 


pir:T08313


sp:HEPA_ECOLI 


ORF 
(bp) 


-c- 
co 

00 


1032 


00 
CO 


1272 


CO 
CM 


to 

Oi 


a> 

CO 
CO 


4800 


CM 
CO 


3699 


CO 
CO 
CO 


24331 


1563 


in 

CO 


CO 

cn 

CO 


to 

CD 
CO 


in 

CM 

oo 


6207 


4596 


2886 


Terminal 
(nt) 


743067 


743900 


745046 


745622 


748442 


747031 


748814 


748886 

i 


757434


753697 


1 

757630 


758364


760906 


762853 


763122 


762582 


767367 


763237 


769547 


774150 


Initial 
(nt) 


743900 


744931 


745513 


746893 


748020 


748026 


CO 

GO 


753685 


757063 


757395 


758262 


760796 


762468 


762497 


762730 


762977 


768191 


769443 


774142 


777035 


So s 

c/j 2 3 


CO 

o 

CO 


o 

CO 


4305 


4306 


4307 


4308 


4309 


4310 

i 


4311 


4312 


4313 


CO 


4315 


4316 


4317 


4318 


CO 


4320 


4321 


4322 


§ S 
CO Z a 


CO 

o 

CO 


o 

CO 


. o 

1 


CO 

o 

oo 


o 

CO 


CO 

o 

00 


cn 
o 

oo 


o 

CO 


oo 


CM 
OO 


CO 

oo 


|5 

i 


ID 

CO 


CO 
CO 


1^ 
^— 

I CD 

I 


CO 

T — 

oo 


GO 


o 

CM 
00 


CM 
oo 


CM 
CM 
CO 
Function


hypothetical protein 


dTDP-Rha:a-D-GlcNAc- 
diphosphoryl polyprenol, a-3-L- 
rhamnosyl transferase 


mannose-1-phosphate 
guanylyltransferase 


regulatory protein 


hypothetical protein 


hypothetical protein 


phosphomannomutase 


hypothetical protein 


mannose-6-phosphate isomerase 






pheromone-responsive protein 




S-adenosyl-L-homocysteine 
regulatory protein


hypothetical protein 


hypothetical protein 


DEAD box ATP-dependent RNA 
helicase 




hypothetical protein 
ATP-dependent DNA helicase
potassium channel
hypothetical protein


hypothetical protein
regulatory protein
myo-inositol monophosphatase


peptide chain release factor 2 


cell division ATP-binding protein 


hypothetical protein 


cell division protein


small protein B (SSRA-binding 
Table 1 (continued)


Homologous gene 


Streptomyces flavopersicus
spcA


Streptomyces coelicolor A3(2) 
Mycobacterium tuberculosis
hypothetical protein


phosphoserine transaminase 


acetyl-coenzyme A carboxylase 
carboxy transferase subunit beta 


hypothetical protein 


sodium/proline symporter




hypothetical protein


fatty-acid synthase 
hypothetical protein


alkaline phosphatase 


Integral membrane transporter 




glucose-6-phosphate isomerase


hypothetical protein 
ATP-dependent helicase


ABC transporter 
repressor of the high-affinity (methyl)
ammonium uptake system 


hypothetical protein 




30S ribosomal protein S18


30S ribosomal protein S14


50S ribosomal protein L33


50S ribosomal protein L28


transporter (sulfate transporter) 


Zn/Co transport repressor 
UTP-glucose-1-phosphate
uridylyltransferase


molybdopterin biosynthesis protein 


ribosomal-protein-alanine N- 
acetyltransferase 


hypothetical membrane protein


cyanate transport protein 




hypothetical membrane protein 


hypothetical membrane protein 


cyclomaltodextrinase


hypothetical membrane protein 


hypothetical protein 


methionyl-tRNA synthetase


ATP-dependent DNA helicase 
transposase


transposase subunit 




D-lactate dehydrogenase 


site-specific DNA-methyltransferase 




transposase 
hypothetical protein


regulator 
S-adenosylmethionine:2-
demethylmenaquinone
methyltransferase




hypothetical protein 


hypothetical protein 




peptide-chain-release factor 3 


amide-urea transport protein 




Matched 
length 
(aa) 


o 


CO 

CN 


CO 
CN 


CO 
CO 








o 


o 
o 


CN 
O 
00 


in 




CN 


CN 
OO 




CD 

in 


o 




Similarity 
(%) 


69.2 


88.1 


oi 
tn 


O) 

o 








56.8 


70.0 


70.0 


GO 

ih 




63.6 


CO 

c6 




68.0 


72.8 




Identity 
(%) 


35.5 


64.8 


27.2 


35.6 








27.7 


o 


42.6 


38.2 




29.8 


24.9 




39.2 


42.8 


Table 1 (continued) 


Homologous gene 


Streptomyces coelicolor A3(2) 
SCF1.02 


Streptomyces coelicolor A3(2) 
SCJ1.15 


Bacillus subtilis 168 yxeH 


Mycobacterium tuberculosis 
H37RV echA9 








Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1 


Streptomyces coelicolor A3(2) 
SCF56.06 


Streptomyces coelicolor A3(2) 
SCE87.17C 


Haemophilus influenzae Rd 
HI0508 menG 




Neisseria meningitidis NMA1953 


Mycobacterium tuberculosis 
H37RV Rv1128c 




Escherichia coli K12 prfC


Methylophilus methylotrophus 
fmdD 




db Match 


CN 
^1 

LL 
O 
C/) 

cL 

CD 


gp:SCJ1_15


CO 

o 
< 

CQ 

i' 

UJ 
X 

> 

a. 
tn 


pir:E70893 








sp:CSP1_CORGL


CO 

tn 

LL 
O 
O) 

d. 

C7) 


gp:SCE87_17


UJ 

< 

UJ 

2 

Q. 
CO 




gp:NMA6Z2491_214


pir:A70539




pir:I59305


prf:2406311A 




gi 


r- 

CN 

m 


o 

CO 
05 


CN 
CJ) 


1017 


in 

CO 




1212 


1366 


o> 

LO 


2373 


CO 
O 


CD 
CD 
CD 


OO 
CO 


1551 


CO 
CO 

cn 


1647 


1269 




Terminal 
(nt) 


970738 


971823 


972244 


974155 


973304 


974962


974965 


977734 


977800 


978368 


981490 


982287 


982294 


984650 


985845 


984864 


988007 




Initial 
(nt) 


970418 


970864 


973035 


973139 


973957 


974186 


976176 


976349 


978378 


980740 


980993 


981622 


CO 
CN 
CD 
<Ji 


983100 


984910


986510 


986739 




SEQ 
NO. 
(a.a.)


4521 


4522 


4523 


4524 


4525 


4526


4527


4528 


4529 


4530 


CO 

in 


4532 


CO 
CO 

tn 


4534 

i 


4535


4536 


4537 




SEQ 
NO. 
(DNA)


CN 
O 


1022 


1023 


rr 

CN 
O 


in 

CM 

o 


CD 
CN} 
O 


r- 

CN 
O 


CO 
CM 
O 

I 


O) 
CN 
O 


O 
CO 

o 


CO 

o 


CN 
CO 
O 


CO 
CD 


1034 


in 

CO 

o 


CO 
CO 

o 


CO 

o 



Function 


amide-urea transport protein 


amide-urea transport protein 


high-affinity branched-chain amino
acid transport ATP-binding protein 


high-affinity branched-chain amino 
acid transport ATP-binding protein 


peptidyl-tRNA hydrolase 


2-nitropropane dioxygenase 


glyceraldehyde-3-phosphate 
dehydrogenase 


polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics


peptidyl-tRNA hydrolase 


50S ribosomal protein L25


lactoylglutathione lyase 


DNA alkylation repair enzyme 


ribose-phosphate 
pyrophosphokinase 


UDP-N-acetylglucosamine 
pyrophosphorylase 




sufI protein precursor


nodulation ATP-binding protein 1 


Matched 
length
(aa)


a 


CO 
CN 


ro 
in 

CN 


CO 
CO 
CN 


00 


CO 
CO 


CM 
CO 


5> 






CO 


00 

o 

CM 


CO 
CO 


CM 

m 




CO 

o 
in 


o 

CO 


Similarity
(%)


61.0 


68.0 


70.0 


69.1 


70.6 


54.0 


72.8 


61.0 


63.2 


65.0 


54.6 


62.5 


79.1 


71.9 




61.7 


64.8 


Identity


40.8 


34.6 


37.9 


35.2 


39.0 


25.2 


39.5 


o 
in 


38.5 


47.0 


28.7 


38.9 


o 


42.0 




30.8 


35.8 


Homologous gene 


Methylophilus methylotrophus 
fmdE 


Methylophilus methylotrophus 
fmdF 


Pseudomonas aeruginosa PAO 
braF 


Pseudomonas aeruginosa PAO 
braG 


Escherichia coli K12 pth 


Williopsis mrakii IFO 0895 


Streptomyces roseofulvus gap 


Neisseria meningitidis 


Escherichia coli K12 pth 


Mycobacterium tuberculosis 
H37Rv rplY


Salmonella typhimurium D21 
gloA 


Bacillus cereus ATCC 10987
alkD 


Bacillus subtilis prs 


Bacillus subtilis gcaD 




Escherichia coli K12 sufI


Rhizobium sp. N33 nodI


db Match 


prf:2406311B


prf:2406311C 


sp:BRAF_PSEAE 


sp:BRAG_PSEAE 


sp:PTH_ECOLI 


IX 

I' 

CL 
CN 

*cL 


o 

M 

I 

Q_ 

CO 

(D 
cL 

CO 


GSP:Y75094 


sp:PTH_ECOLI 


pir:B70622


> 

< 
J 

CD 
-J 

id. 

10 


prf:2516401BW 


sp:KPRS_BACCL 


pir:S66080 




_j 
O 
o 

UJ 

_l 
u. 
3 
O) 

CL 
CA 


CO 

ay 

X 

ce 

1 

Q 

O 

Ql 


si 


CN 
OO 

oo 


1077 

1 


CO 
CM 
1^ 


o> 

CO 


CN 
CO 


1023 


1065 


CD 
CO 
CO 


tn 


o 
o 

CD 


cn 

CN 


CN 
CO 


in 

CJJ 


1455 


1227 


1533 


OO 


Terminal 
(nt) 


988904 


989980 


990705 


991414 


991417 


993080 


994613 


994106 


994845 


995527 


996830 


996833 


997466 


998455 


1000016 


1002864 


1003930 


Initial 
(nt) 


988023 


988904 


989980


990716 

i 


992028 


992058 


993549 


994474 


995375 


996126 


996402 


997456 


998440 


999909 


1001242 


1001332 


1003013 


SEQ 

NO.
(a.a.) 


4538 


cn 

CO 

tn 

! 


4540 


in 


4542 


4543 


4544 


4545 


4546 


in 


in 


in 


4550 


4551

1 


4552 
4553 


tn 
tn 


SEQ 
NO. 
(DNA) 

1038 


CJ> 
CO 

o 


1040 
1041 


CN 
TT 
O 


CO 
XT 

o 


O 


m 

O 


CO 

o 


o 


00 
XT 
O 


o 


o 
in 
o 


o 


CM 

in 
o 


CO 

m 
o 


1054 
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Function 


hypothetical membrane protein 


two-component system sensor 
histidine kinase 


two component transcriptional 
regulator (luxR family) 




hypothetical membrane protein 


ABC transporter 




ABC transporter 


gamma-glutamyltranspeptidase 
precursor 










transposase protein fragment 


transposase (IS1628 TnpB)








transcriptional regulator (TetR- 
family) 


transcription/repair-coupling protein






Matched 
length 
(a.a)


CM 
CM 


in 


CM 
O 
CN 




at 

CO 


in 

CO 

in 




ro 
in 


CO 
CO 
CD 










CO 


CO 

n 

CN 








CO 
CO 


1217 




















































Similarity


CM 




CD 




in 


o 




o 


CO 










o 


O 








CO 






20 


CO 
CO 


<x> 


CO 




CO 


m 






od 
in 










CM 


C3 
O 
^ 








ai 
in 


cri 

CO 




CO 














































Identity 
(%) 


CM 


CO 


CO 




m 


CO 




o 












o 


CO 








o 


CM 






O 
CO 


CM 


CO 

CO 




CO 


od 

CM 






CM 

ro 










CO 


<J> 

Oi 








CO 
CN 


CO 
CO 




25 

a> 
.E 
o 

30 

0) 

35 


Homologous gene 


Streptomyces lividans ORF2


Escherichia coli K12 uhpB 


Streptomyces peucetius dnrN





Streptomyces coelicolor A3(2)
SCF15.07


Streptomyces glaucescens strV 




Mycobacterium smegmatis exiT 


Escherichia coli K12 ggt










Corynebacterium glutamicum 
TnpNC 


Corynebacterium glutamicum 
22243 R-plasmid pAG1 tnpB 








Escherichia coli tetR 


Escherichia coli mfd 
































CO 
CM. 


















db Match 


o 
in 

CO 

o 
— > 


sp:UHPB_ECOLI


prf:2107255A 




gp:SCF15_7 


pir:S65587 




o 

CO 
CL 


_j 
o 
o 

LU 

a 

CD 

d 










GPU:AF164956. 


gp:AF121000_8








sp:TETC_ECOLI


sp:MFD_ECOLI






gi 


T— 

n 

CO 


1257 


o 

CO 


o 

CM 


1155 


o 


CO 

in 


ro 


1965 


CD 
CM 


in 


CM 
CJ) 


o> 
o 


CO 
CN 


CO 

o 


CN 

to 


Oi 
in 


CM 
CO 


in 

CD 


3627 


1224 




Terminal 
(nt) 


1004783 


1006085 


1006697 


1006734 


1008152 


1010061 


1008534 


1011790 


1011797 

1 


1014264 


1014343 


1015116 


1016560


1015450 


1015145 


1017018 


1017274 


1018393 


1019066 


1022716 


1019390 




Initial 
(nt) 


1003953 


1004829 


1006089 


1006937 


1006998 


1008622 


1008686 


1010057 


1013761 


1014016 


1014861 


1014925 


1015652 


1015692 


1015852 


1016557 


1017870


1018082


1018416 


1019090 


1020613 




to 2 


4555 


4556 


4557 


4558 


4559 


4560 


4561 


4562 


4563 


4564 


4565 


4566 


4567 


4568 


4569 


4570 


4571 


4572 


4573 


4574 


4575 




SEQ 
NO. 
(DNA) 


1055 


1056 


1057 


1058 


cn 
in 
o 


o 
to 
o 


CO 

o 


CM 
CD 
O 


1063 


CD 
O 


in 

CD 

o 


to 
to 
o 


r- 
to 
o 


CO 
CD 
O 


CD 
(D 
O 


o 

1^ 
o 


o 


<N 
O 


CO 

o 


o 


in 
o 
Function


Neisserial polypeptides predicted to 
be useful antigens for vaccines and 
diagnostics 


multidrug resistance-like ATP- 
binding protein. ABC-type transport 
protein 


ABC transporter 


hypothetical membrane protein 




hypothetical protein 






lpqU protein


enolase (2-phosphoglycerate
dehydratase)(2-phospho-D-
glycerate hydro-lyase)


hypothetical protein 


hypothetical protein 


hypothetical protein 


guanosine pentaphosphatase or
exopolyphosphatase 




threonine dehydratase 






Matched 
length 


(O 


CM 
CO 

to 


N- 

m 


CO 

to 

CO 




CO 
GO 

T— 






CN 


CN 
CM 






CO 

in 
^t— 


a> 

CM 
CO 




CO 






>» 






































Similarity
(%) 


69.0 


62.7 


81.9 


100.0 




in 






68.9 


86.0 


58.0 


55.0 


77.8 


55.0 




64.7 






Identity 
(%) 


48.0 


31.3 


50.2 


100.0 




33.4 






46.5 


tn 

(O 


68.0 


31.9 


59.5 


25.2 




30.3 




25 

C 

o 

X2 

55 


Homologous gene 


Neisseria gonorrhoeae 


Escherichia coli mdlB 


Mycobacterium tuberculosis 
H37RV Rv1273c 


Corynebacterium glutamicum 
ATCC 13032 orf3 




Bacillus subtilis yabN 






Mycobacterium tuberculosis 
H37Rv Rv1022 lpqU


Bacillus subtilis eno 


Aeropyrum pernix K1 APE2459 


Mycobacterium tuberculosis 
H37RV Rv1024 


Mycobacterium tuberculosis 
H37RV Rv1025 


Escherichia coli gppA 




CD 
o 

"o 
tj 
to 

Ic 
o 

<D 
XI 

o 

t/i 
LU 






db Match 


o 

CO 

tn 

>- 
CL 


-J 

o 
o 

ir. 

1 

m 
—1 
Q 


YC73_MYCTU 


YLI3_CORGL




CO 
m 

1 

CO 

< 

>■ 






:A70623 


:ENO_BACSU 


?:B72477 


C70623 


D70623 


GPPA_ECOLI 




THD2_ECOLI








CO 
O 


'd. 

V) 


CL 

tn 


ci. 












Q. 
to 


CL 


a. 


■q. 


Q. 
to 




tO 








03 
CN 
CN 


1968 


1731 


2382 


r*- 
a> 

CM 


in 

GO 

in 


to 

CN 


OO 
CO 


to 

00 


1275 




o 
in 


CD 

in 


CO 
CO 

cn 


OO 

o> 


o 

CO 
CJ) 


in 
o 




Terminal 
(nt) 


1021078 


1022699 


1024666 


1026505 


1032181 


1032780


1032760 


1033269 


1034739 


1036223 


1036016 


1036855 


1037445 


1038410 


1036498 


1038721 


1039977 




Initial 
(nt) 


1021305 


1024666 


1026396 


1028886 


1031885 


1032196 


1033185 


1033646 


1 

1033954 


1034949 


1036159


1036316 


1036900 


1037448 


1037481 


1039650 


1039783 




SEQ 
NO. 
(a.a.) 


4576 


4577 


4578 


4579 


o 

00 

! in 


OO 

m 


4582 


4583 


4584


4585 


4586


4587


4588 


4589 


4590


4591 


4592 




SEQ 
NO. 
(DNA) 


CO 

o 


O 


CO 

o 


1079 


1080 


OO 

o 


CM 
OO 
O 


CO 
CO 

o 


1084 


m 

OO 

o 


to 

CO 

o 


CO 

o 


1088 


O) 
00 

o 


o 

Ol 

o 


CD 
O 


CM 
Ol 

o 
Function




hypothetical protein 


transcription activator of L-rhamnose 
operon 


hypothetical protein 




hypothetical protein 


transcription elongation factor 


hypothetical protein 


lincomycin-production 




3-deoxy-D-arabino-heptulosonate-7- 
phosphate synthase 




hypothetical protein or undecaprenyl 
pyrophosphate synthetase 


hypothetical protein 






pantothenate kinase 


serine hydroxymethyl transferase 


p-aminobenzoic acid synthase 






Matched 
length 
(aa) 




CD 

lO 


CM 
CM 


CM 
GO 
CN 




O 


CO 


o 


o 
o 

CO 




to 

CO 






CO 
CM 






CO 

o 

CO 


CO 


CO 
O) 
(D 






Similarity 
(%) 






55.8 


80.1 




57.1 


60.1


72.1 


56.3 




99.5 




97.3 


100.0 






a> 


100.0 


70.1 






>* 

Is 




46.3 


24.8 


57.8 




30.0 


O 

tri 

CO 


34.3 


31.7 




99.2 




96.0 


100.0 






53.9 


99.5 


47.6 




Table 1 (continued) 


Homologous gene 




Thermotoga maritima MSB8 


Escherichia coli rhaR 


Mycobacterium tuberculosis 
H37RV Rv1072 




Streptomyces coelicolor A3(2) 
SCF55.39


Escherichia coli greA 


Mycobacterium tuberculosis 
H37RV Rv1081c 


Streptomyces lincolnensis lmbE




Corynebacterium glutamicum 
aroG 




Corynebacterium glutamicum 
CCRC18310 


Corynebacterium glutamicum 
(Brevibacterium flavum) 






Escherichia coli coaA 


Brevibacterium flavum MJ-233 
glyA 


Streptomyces griseus pabS 






db Match 




pir:B72287


sp:RHAR_ECOLI


pir:F70893 




gp:SCF55_39 


sp:GREA_ECOLI 


pir:G70894 


pir:S44952 




-J 
O 

DC 

o 
o 

o' 

O 
DC 
< 

(/) 




sp:YARF_CORGL


sp:YARF_CORGL






_j 
o 

u 

UJ 

s' 

o 
u 
d 

cn 


gsp:R97745 


sp:PABS_STRGR 






ORF 

(bp)


o 

CO 
CO 


cy> 

oo 


CO 

o> 


CD 
CO 


GO 
CO 


O 

in 


CM 
CM 

in 


CO 

oo 


CO 
00 


CO 
CO 


1098 


CO 
CO 
CD 


in 

CO 




CD 

in 


GO 
CO 


CO 
CO 
CJ> 


1302 


1860 


CO 
CM 




Terminal 
(nt) 


1040325 


1040682 


1041917 


1042842 


1042850


1043298 


1043774 


1044477 


1046030 


1046390 


o 
o 


1046820 


1048501 


1048529 


1049043 


1049068 


1049427


1051925 


1053880 


1054602 




Initial 
(nt) 


1039996 


CJ> 

o 
o 


1040925 


1042027 


1043236 


1043747 

1 


1044295


1044959 


1045158 


1046073 


1046610 


1047452 


1047827 


1048356 


1048525 


1049385 


1050362 


1050624 


1052021 


1053880

1 




SEQ 
NO. 
(a.a.)


4593 


<J) 
If) 


4595 


4596 


4597 


CO 

in 


Cl 

to 


4600 


4601 


4602 


4603 


4604 


in 

1 o 

! CD 

j 


4606 


4607 


4608


4609


4610 


£ 


4612 




»^ 5 S 


I ^ 
1 o 


O 


in 

CD 
O 


CD 

O) 
O 


cn 
o 


GO 

o 


C7i 
CJ) 
O 


o 
o 


o 


CM 
O 


CO 

o 


o 


1105 


CD 
O 


1107 


00 

o 


o 

C5 


o 


^ — 
*— 


CN 
Function






phosphinothricin resistance protein


hypothetical protein 




hypothetical protein 


lactam utilization protein 


hypothetical membrane protein 






transcriptional regulator 




fumarate hydratase precursor 


NADH-dependent FMN 
oxidoreductase






reductase 


dibenzothiophene desulfurization 
enzyme A 


dibenzothiophene desulfurization 
enzyme C (DBT sulfur dioxygenase)


dibenzothiophene desulfurization 
enzyme C (DBT sulfur dioxygenase) 






Matched 
length 
(aa) 






lO 
CO 


o 
o 
cn 




in 

CN 
CN 


CO 
CN 


m 
<o 
^ — 






o 

CN 




CO 

in 


CD 

in 






CO 


CO 


CN 
CO 


CD 
CO 






Similarity 
(%) 






58.8 


o 
cr> 
m 




57.8 


52.2 


81.2 






63.2 




79.4 


65.4 






81.0 


67.7 


51.3 


61.6 






Identity 
(%)






30.3 


30.3 




37.8


30.8 


40.6 






26.0 




52.0 


32.7 






55.4 


39.1 


25.8 


26.9 






Homologous gene 






Alcaligenes faecalis ptcR 


Escherichia coli ybgK 




Escherichia coli ybgJ 


Emericella nidulans lamB


Bacillus subtilis ycsH 






Bacillus subtilis ydhC 




Rattus norvegicus (Rat) fumH 


Rhodococcus erythropolis 
IGTS8 dszD 






Streptomyces coelicolor A3(2) 
StAH10.16 


Rhodococcus sp. IGTS8 soxA


Rhodococcus sp. IGTS8 soxC 


Rhodococcus sp. IGTS8 soxC 






db Match 






o 

CO 

O 
< 
CL 
CD 


sp:YBGK_ECOLI 




—J 
O 
o 
us 

t 

-> 

o 

CD 

CL 


UJ 

:g 

LU 

CL 
tn 


O 
< 

CO 

x' 

tn 
o 
>- 

CL 
t/) 






CO 

o' 

X 

a 

d. 

tA 




sp:FUMH_RAT 


cn 

CO 

o 

Li. 
< 

id. 

O) 






gp:SCAH10_16


sp:SOXA_RHOSO


sp:SOXC_RHOSO


o 

CO 

O 
X 

q: 

t 

o 

X 

o 
tp 

id. 
cn 






ORF 
(bp) 


CO 
oo 


CD 


cn 
in 


Oi 
CO 


1056 


a> 

CO 
CO 


CO 

in 


T— 

C7) 

in 


CN 
CO 


CO 

o 

(O 


CO 
CD 


1278 


1419 


cn 

CO 


5 

CN 


V 


CO 

in 


1488 


1080 


1197 


CD 
CO 

r- 


o 

CD 
CO 


Terminal 
(nt) 


1055722 


1054640 


1056319 


1056322 


1058628 


1057200 


1057843


1058624 


1059889 


1059962 


1060792 


1062146 


1062211 


1064424 


1064478 


1064754 


1065304 


1067570 


1068649 


1069845 


1068913 


1069119 


Initial 
(nt) 


1054859 


1055032 


1055783 


1057200 


1057573 


1057868 


1058598 


1059214 


1059218 


1059360


1060112 


1060869 


1063629 


1063936 


1064738 


1065200 


1065867 


1066083 


1067570 


1068649 


1069692 


1069608 
NO.
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CO 


4614 


4615 


4616 


h- 
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to 
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Oi 
CO 
4622
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CO 
CO 
CO 
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m 
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T 
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o 

CM 


CN 


CM 
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CN 


OJ 


in 

CN 


ID 
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CN 
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CN 
Function


FMNH2-dependent aliphatic 
sulfonate monooxygenase 


glycerol metabolism 


hypothetical protein 


hypothetical protein 




transmembrane efflux protein 


.tr 
C 

13 

Xi 

(A 

E 

<A 

0> 
CO 
CTJ 
O 

o 

o 

ja 

o 

CD 
T3 
O 
X 

a> 


exodeoxyribonuclease large subunit


penicillin tolerance 


polypeptides predicted to be useful 
antigens for vaccines and 
diagnostics 




permease 




sodium-dependent proline 
transporter 


major secreted protein PS1 protein 
precursor 


GTP-binding protein 


virulence-associated protein


ornithine carbamoyltransferase 


hypothetical protein
length
(aa) 


OJ 

cn 


lO 
CN 

cn 


CN 


CN 
CN 




CN 
CO 


to 


(O 

to 
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<n 
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CO 




CN 

in 
in 


CN 

T— 


1^ 
to 

CO 


tn 


o 

ro 


ro 


Similarity 
(%) 


73.1 


75.7 


56.4 


66.1 




78.1 


67.7 


55.6 


78.8 


47.0 




63.9 




61.4 


60.0 


88.6 


o 

d 
oo 


58.8 


69.9 


Identity 
45.3


44.3 


27.5 


31.3 




36.6 


CO 


30.0 


CM 
d 
to 


33.0 




26.3 




30.3 


29.9 


<=> 


57.3 


29.6


39.2 


Homologous gene 


Escherichia coli K12 ssuD 


Escherichia coli K12 glpX


Mycobacterium tuberculosis 
H37Rv Rv1100


Bacillus subtilis ywmD 




Streptomyces coelicolor A3(2) 
SCH24.37 


Escherichia coli K12 MG1655 
xseB 


Escherichia coli K12 MG1655
xseA 


Escherichia coli K12 lytB


Neisseria gonorrhoeae 




Escherichia coli K12 perM




Rattus norvegicus (Rat) SLC6A7 
ntpR 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1


Bacillus subtilis yyaF


Dichelobacter nodosus intA


Pseudomonas aeruginosa argF 


Bacillus subtilis 168 ykkB 
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CM 

oo 
1080786


1080972 


1082951 
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1 
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T 
Function


biotin carboxylase 












hypothetical protein 


magnesium chelatase subunit 


2,3-PDG dependent 
phosphoglycerate mutase 


hypothetical protein 


carboxyphosphonoenolpyruvate 
phosphonomutase 


tylosin resistance ATP-binding
protein 


hypothetical protein 


alkylphosphonate uptake protein 


transcriptional regulator 


multi-drug resistance efflux pump 


transposase (Insertion sequence 
IS31831) 
length
(a.a) 
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:3 

"c 
o 

30 y- 

35 


Homologous gene 


Synechococcus sp. PCC 7942 
accC 












Mycobacterium tuberculosis 
H37Rv Rv0959


Rhodobacter sphaeroides ATCC 
17023 bchI 


Amycolatopsis methanolica pgm


Mycobacterium tuberculosis 
H37RV Rv2133c 


Streptomyces hygroscopicus 
SF1293 BcpA 


Streptomyces fradiae tlrC


Mycobacterium tuberculosis 
H37RV Rv2923c 


Escherichia coli K12 MG1655
phnA 


Bacillus subtilis 168 yxaD 


Streptococcus pneumoniae 
pmrA 


Corynebacterium glutamicum 
(Brevibacterium lactofermentum) 
ATCC 31831 
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ID 
\- 
O 
> 

1 

in 
^— 
I- 
> 


X 

t/i 
O 
X 

ce 

i 

X 

a 

CO 


:AMU73808_1


A70577 


STMBCPA_1


TLRC_STRFR


D 
1- 
U 
> 

o» 

CO 

o 


PHNA_ECOLI


D 
CO 

o 
< 

CD 

1 

Q 
> 


CO 
CO 

a. 

CO 


543613 






Q- 
OI 












:ds 


id. 
(/) 


CL 
O) 


"q. 


Q. 
OI 


CL 
</> 


lA 


d. 
m 


bL 
(/> 


cL 

Ol 


"5. 




u 


1737 


O) 

in 


CO 


in 

CO 


co 
in 


O) 
CO 
CD 


1956 


1296 


CN 
CO 


in 
o 


CN 
CO 
1^ 


1641 


CO 
O) 
CO 


CN 
CO 
103524
105561
106086
111432
112230


CO 
CN 


114319 


115793 
(nt)
1102043
1103180


1103951


1104923 


1106058 
1108201


1108993 
1113102
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lO 
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to 
(DNA)


1 






p- 


m 


CD 


P- 


1178 


cn 


1180 


CO 


1 
Function


cysteine desulphurase 


nicotinate-nucleotide 
pyrophosphorylase 


quinolinate synthetase A 


DNA hydrolase 


hypothetical membrane protein 


hypothetical protein 


hypothetical protein 


lipoate-protein ligase A 


alkylphosphonate uptake protein 
and C-P lyase activity 


transmembrane transport protein or
4-hydroxybenzoate transporter


p-hydroxybenzoate hydroxylase (4- 
hydroxybenzoate 3- 
monooxygenase) 


hypothetical membrane protein 


ABC transporter ATP-binding protein 


hypothetical membrane protein 




Ca2+/H+ antiporter ChaA 


c 

OJ 
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CI. 

TO 
O 

"o 


hypothetical membrane protein
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length 
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77.6 


60.9 
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61.6 
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CO 

in 


CO 
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43.g 


CN 
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37.0 


23.4 


36.0 
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30.1 I 
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28.8 


40.8 


36.7 


24.8 


CO 

in 

CN 




33.3 


od 

CN 


27.6 


Homologous gene


Ruminococcus flavefaciens 
cysteine desulphurase gene 


Mycobacterium tuberculosis 


Bacillus subtilis nadA 


Streptomyces coelicolor 
SC5B8.07 


Deinococcus radiodurans R1 
DR1112 


Streptomyces coelicolor 
SC3A7.08 


Escherichia coli K12 MG1655 
ybdF 


Escherichia coli K12 lplA


Escherichia coli K12 phnB 


Pseudomonas putida pcaK 


Pseudomonas aeruginosa phhy 


Bacillus subtilis 168 ykoE 


Escherichia coli yjjK 


Bacillus subtilis 168 ykoC 




Escherichia coli chaA 


Pyrococcus abyssi Orsay 
PAB1341 


Bacillus subtilis ywaF 


db Match 


gp:RFAJ3152_2


H- 
O 
> 

1 

o . 

Q 
< 

id. 


pir:E69663 


CO 
in 
O 
If) 
b. 

O) 


m 

i 

r— 
CD 

cn 

o 
o 

UJ 

< 

icL 

CJl 


gp:SC3A7_8 


sp:YBDF_ECOLI 


gp:AAA21740_1


_j 
o 
o 

LU 

1 

CO 

D_ 

Q. 
t/) 


sp:PCAK_PSEPU 


HI 

< 

LU 
CO 

> 
X 
X 
Q- 

inL 


pir:A69859 


sp:YJJK_ECOLI 


pir:G69858 




_i 

o 
o 

LU 

s' 

X 

o 

id. 


pir:C75001


CO 

< 

id- 
tn 




ORF 
(bp) 


1074 


CO 

oo 


1182 


CN 
CO 


o 
o 

CO 


o 
o 

CO 


CM 
CO 


(31 
00 




1293 


1185 


oo 
oo 
m 


1338 


CO 

ir> 


CO 

tn 


1050 


oo 
o 


CO 
CN 




Terminal
(nt) 


1115832 


1116908 


1117751 


1119086 


1120804 


1120833 
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1127009 


1128350 
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1130704 


1131428 


1131401 
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1116905 


1117744 
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1120205 
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Function 


excinuclease ABC subunit A 


thioredoxin peroxidase 






hypothetical membrane protein 


oxidoreductase or thiamin 
biosynthesis protein 










chymotrypsin BII


arsenate reductase (arsenical pump 
modifier) 


hypothetical membrane protein


hypothetical protein 


hypothetical protein 


GTP-binding protein (tyrosine
phosphorylated protein A)


hypothetical protein 


hypothetical protein 




ferredoxin [4Fe-4S]
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length 
(aa) 


to 
o> 


to 






oo 

CO 


CN 
CD 
CN 










CN 




o 

CO 




^ — 

CN 
CN 




'90S 


tn 

CO 




CO 

o 


20 


Similarity 
(%) 


58.7 


81.7 






72.0 


49.0 










51.3 


72.1 


62.4 


71.4 


62.9 


76.7 


54.9 


61.9 
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Homologous gene 


Thermus thermophilus uvrA


Mycobacterium tuberculosis
H37Rv tpx






Escherichia coli yedL 


Streptomyces coelicolor A3(2) 
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Escherichia coli 


Bacillus subtilis yyaD 


Mycobacterium tuberculosis
H37Rv Rv1632c


Mycobacterium tuberculosis
H37Rv Rv1157c


Escherichia coli K12 typA


Mycobacterium tuberculosis
H37Rv Rv1166


Mycobacterium tuberculosis
H37Rv Rv1170




Streptomyces griseus fer 
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Function 


hypothetical protein 


ATPase 


hypothetical protein 


hypothetical protein 


hypothetical protein 






2-oxoglutarate dehydrogenase 


ABC transporter or multidrug 
resistance protein 2 (P-glycoprotein 
2) 


hypothetical protein 


shikimate dehydrogenase 


para-nitrobenzyl esterase 








tetracycline resistance protein 


metabolite export pump of
tetracenomycin C resistance
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64.7 








61.4 


64.2 






Identity 
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45.5 


43.6 


60.4 


49.8 


57.9 
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28.8 


31.7 


25.5 


35.7 








27.1 


32.4




Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis
H37Rv Rv1224


Escherichia coli mrp


Mycobacterium tuberculosis
H37Rv Rv1231c


Mycobacterium tuberculosis
H37Rv Rv1232c


Mycobacterium tuberculosis
H37Rv Rv1234






Corynebacterium glutamicum 
AJ12036 OdhA 


Cricetulus griseus (Chinese 
hamster) MDR2 


Mycobacterium tuberculosis
H37Rv Rv1249c


Escherichia coli aroE 


Bacillus subtilis pnbA 








Escherichia coli transposon 
Tn1721 tetA 


Streptomyces glaucescens tcmA 
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Function 


DEAD box ATP-dependent RNA 
helicase 


bacterial regulatory protein, tetR
family


pentachlorophenol 4- 
monooxygenase 


maleylacetate reductase 


catechol 1,2-dioxygenase 




hypothetical protein 


transcriptional regulator 




hypothetical protein 


phosphoesterase 


hypothetical protein 






esterase or lipase 
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Similarity 
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Table 1 (continued) 


Homologous gene 


Klebsiella pneumoniae CG43
DEAD box ATP-dependent RNA
helicase deaD


Mycobacterium leprae
EI1308_C2_181


Sphingomonas flava pcpB


Pseudomonas sp. B13 clcE


Acinetobacter calcoaceticus 
catA 




Mycobacterium tuberculosis 
H37Rv Rv2972c


Saccharomyces cerevisiae 
SNF2 




Streptomyces coelicolor A3(2) 
orfZ 


Mycobacterium tuberculosis 
H37Rv Rv1277


Mycobacterium tuberculosis 
H37Rv Rv1278 






Petroleum-degrading bacterium 
HD-1 hde 
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1214871 
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1219895 
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Function 


short-chain fatty acids transporter 


regulatory protein 






fumarate (and nitrate) reduction 
regulatory protein 


mercuric transport protein periplasmic
component precursor


zinc-transporting ATPase Zn(II)-
translocating P-type ATPase


GTP pyrophosphokinase (ATP:GTP
3'-pyrophosphotransferase) (ppGpp
synthetase I)


tripeptidyl aminopeptidase






homoserine dehydrogenase 






nitrate reductase gamma chain 


nitrate reductase delta chain 


nitrate reductase beta chain 


hypothetical protein 


hypothetical protein 


nitrate reductase alpha chain 


nitrate extrusion protein
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length 
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Efwinia chrysanthemi recS 






Escherichia coli K12 MG1655 fnr 


Shewanella putrefaciens merP 


Escherichia coli K12 MG1655
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Vibrio sp. S14 relA 


Streptomyces lividans tap 
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Bacillus subtilis narl 
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Aeropyrum pernix K1 APE1291 


Aeropyrum pernix K1 APE1289 
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Function 


molybdopterin biosynthesis cnx1 
protein (molybdenum cofactor 
biosynthesis enzyme cnx1) 


extracellular serine protease
precursor




1 

hypothetical membrane protein 


hypothetical membrane protein


molybdopterin guanine dinucleotide
synthase


molybdopterin biosynthesis protein


molybdopterin biosynthesis protein
(molybdenum cofactor
biosynthesis enzyme)


medium-chain fatty acid-CoA ligase


Rho factor 








peptide chain release factor 1


protoporphyrinogen oxidase 




hypothetical protein 


undecaprenyl-phosphate alpha-N-acetylglucosaminyltransferase
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Table 1 (continued) 


Homologous gene 


Arabidopsis thaliana CV cnx1 


Serratia marcescens strain IFO- 
3046 prtS 




Mycobacterium tuberculosis
H37Rv Rv1841c


Mycobacterium tuberculosis
H37Rv Rv1842c


Pseudomonas putida mobA 


Mycobacterium tuberculosis 
H37Rv Rv0438c moeA 


Arabidopsis thaliana cnx2 


Pseudomonas oleovorans


Micrococcus luteus rho 








Escherichia coli K12 RF-1 


Escherichia coli K12 




Mycobacterium tuberculosis 
H37Rv Rv1301


Escherichia coli K12 rfe
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Function 




hypothetical protein 


ATP synthase chain a (protein 6)


H+-transporting ATP synthase lipid-
binding protein, ATP synthase C
chain


H+-transporting ATP synthase chain 
b 


H+-transporting ATP synthase delta 
chain 


H+-transporting ATP synthase alpha 
chain 


H+-transporting ATP synthase
gamma chain


H+-transporting ATP synthase beta
chain


H+-transporting ATP synthase
epsilon chain


hypothetical protein 


hypothetical protein 


putative ATP/GTP-binding protein 


hypothetical protein


hypothetical protein 


thioredoxin 
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Table 1 (continued) 


Homologous gene




Corynebacterium glutamicum 


Escherichia coli K12 atpB


Streptomyces lividans atpL 


Streptomyces lividans atpF 


Streptomyces lividans atpD 


Streptomyces lividans atpA


Streptomyces lividans atpG


Corynebacterium glutamicum 
AS019 atpB


Streptomyces lividans atpE 


Mycobacterium tuberculosis
H37Rv Rv1312


Mycobacterium tuberculosis
H37Rv Rv1321


Streptomyces coelicolor A3(2)


Bacillus subtilis yqjC


Mycobacterium tuberculosis
H37Rv Rv1898


Mycobacterium tuberculosis
H37Rv Rv1324
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FMNH2-dependent aliphatic 
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glucose-resistance amylase 
regulator (catabolite control protein) 


ribose transport ATP-binding protein


high affinity ribose transport protein 


periplasmic ribose-binding protein 


high affinity ribose transport protein 


hypothetical protein 


iron-siderophore binding lipoprotein 


Na-dependent bile acid transporter 


RNA-dependent amidotransferase B 


putative F420-dependent NADH 
reductase 


hypothetical protein 
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hypothetical membrane protein 
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hypothetical membrane protein 


hypothetical protein 




nitrate transport ATP-binding protein


maltose/maltodextrin transport ATP- 
binding protein 


nitrate transporter protein 






actinorhodin polyketide dimerase


cobalt-zinc-cadmium resistance
protein






hypothetical protein




D-3-phosphoglycerate 
dehydrogenase 


hypothetical serine-rich protein 






hypothetical protein 




15 


Matched 
length 
(aa) 


(O 


iO 




to 
^ 


CO 


CN 
CO 






Ol 


o 

CO 






CN 

to 




o 

CO 

in 


m 
o 






o 

OI 
CO 




20 


Similarity 
(%) 


100.0 


55.0 




80.8 


78.2 


56.8 






73.2 


72.7 






53.7 




100.0 


52.0 






63.1 






Identity 
(%) 


100.0 


45.0 




50.9 


46.0 


CO 
CN 






39.4 


39.1 






22.9 




99.8 


29.0 






32.9 




25 

.E 
c 
o 

30 ^ 
40 


Homologous gene 


Corynebacterium glutamicum 
ATCC 13032 yilV


Sulfolobus solfataricus 




Synechococcus sp. nrtD 


Enterobacter aerogenes 
(Aerobacter aerogenes) malK 


Anabaena sp. strain PCC 7120 
nrtA 






Streptomyces coelicolor 


Ralstonia eutropha czcD 






Methanococcus jannaschii 




Brevibacterium flavum serA 


Schizosaccharomyces pombe 
SPAC11G7.01 






Rhodobacter capsulatus strain 
SB1003 




db Match 


sp:YILV_CORGL 


GP:SSU18930 26 
3 




Q. 
Z 
> 
CO 

q' 

tr 

id. 


LU 
< 

_J 
< 

a. 


sp:NRTA_ANASP 






O 
o 
oc 
1- 
(/) 

co' 

Q 
to 


sp:CZCD_ALCEU 






sp:Y686_METJA 




gsp:Y22646 


O 

Q. 
X 

u 
z 

UJ 

> 

CO 






CD 
CO 

o 
k- 

l: 

Q. 






ORF 

(bp)


1473 


CO 
OJ 


to 
o 

<D 


CO 

o 


CO 
CN 


(N 
CO 
00 




cn 

CD 
CO 


to 

CO 

TT 


XT 
in 

C3> 


CO 

m 


O 

to 


1815 


1743


1590


CN 
CO 


r- 
co 

CO 


1062 


1866 


CN 
O 


45 


Terminal 
(nt) 


1336095 


1338379 


1342677 


1341950 


1342461 


1342794 


1344464 


1344808 


1345420 


1346439 


1345335 


1345642 


1348272 


1350076 


1352444 


1351727 


1353451 


1354540 


1357554 


1356853 


50 


Initial 
(nt) 


1337567 


1338609 


1342072 


1342457 


1342727 


1343675 


1344018 


1344440 


1344935


1345486 


1345487 


1346331 


1346458 


1348334 


1350855 


1352053 


1352585 


1355601 


1355689 


1356452 




SEQ NO.
(a.a.)


4901


4902


4903


4904


4905


4906


4907


4908


4909


4910


4911


4912


4913


4914


4915


4916


4917


4918


4919


4920


55 




1401


1402


1403


1404


1405


1406


1407


1408


1409


1410


1411


1412


1413


1414


1415


1416


1417


1418


1419


1420






75 



75 

lie 
u 

TO 
O 
O 

Is 

11 

x: -Q 



II 

a> d> 

a> c 2 
ra a> oj 

g^'-? E 
o o 

0) cu* 

CO $^ CO 

(Do*"' 

E>^E 
o o 



c 
u 

6 ^ 
" 2 * 

O T3 -O 



00 
CM 
CN 



CM 



o 



50 



55 



So s 

CO Z «. 



O H < 

LU O ^ 

CO 2 Q 



O 
O 

UJ 

I 

UJ 

o 

X 



o 
. ^ 

<U O V 

</» r- «/» 

^ 3 ^ 

"(rt .£1 '«» 
CSC 

til" 

0) t a, 
e -g E 



CM 



CM 

"5 
o 

03 



o 

O 
liJ 

o' 

m 

id. 



CM 

to 

O 



O) 

in 

fsl 
CO 

tn 

CO 



CM 



CM 

CM 



00 

5 



cn 
to 
(o 

O) 

m 



CM 

o 



EC 



00 CO 

to ^ 

1- oo 

O CM 

(O to 

n n 



CM 



CN 



t/3 



CD 

CM CM 



fcO to 
CN CM 



C 

o 

a. 



o 

Jo 
c 

E 

.2 
Ic 



o 

CO 



in 

CO 



O 
Ic 



O 

< 



3 

CD 



D 
CO 
O 
< 
CO 

o 

X 



00 



to 

CO 



CN 
CO 



tn 

O) 
CO 
00 

to 

CO 



in 
in 



CO 

tD 
CO 



00 

to 
to 



o 

CO 

to 



o 
oo 
to 

CO 



tj> 
to 

CO 



CO 



CO 
CO CO 



CO 



to 

CO 



CO 



CO 



to 

CO 



CO 



CO 






Function 






lipoprotein 




glycogen phosphorylase 






hypothetical protein 


hypothetical membrane protein 
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hypothetical protein 










hemolysin 


hemolysin




DEAD box RNA helicase 


ABC transporter ATP-binding protein 


6-phosphogluconate dehydrogenase 


thioesterase 
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phosphomethylpyrimidine kinase 


hydroxyethylthiazole kinase


cyclopropane-fatty-acyl-phospholipid 
synthase 
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virulence-associated protein 
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Function 


methylmalonyl-CoA mutase beta 
subunit 


hypothetical membrane protein 




hypothetical membrane protein 


hypothetical membrane protein 


hypothetical protein 




ferrochelatase 


invasin




aconitate hydratase 


transcriptional regulator 


GMP synthetase 


hypothetical protein 


hypothetical protein 




hypothetical protein 
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Function 


antigenic protein 


antigenic protein 


cation-transporting ATPase P 




hypothetical protein 










host cell surface-exposed lipoprotein 


integrase 


ABC transporter ATP-binding protein




sialidase 


transposase (IS1628) 


transposase protein fragment 


hypothetical protein 




dTDP-4-keto-L-rhamnose reductase 


nitrogen fixation protein 


15 


Matched 
length 
(aa) 


CO 


CN 

in 


cn 
oo 

00 




o 

CM 










O 


in 


CD 




00 
CO 


CO 
CO 
CN 


r- 

CO 


CO 
00 




o 




20 


Similarity
(%)


o 
o 

CD 


69.0 


73.2 




58.3 










73.8 


60.4 


64.4 




72.4 


100.0 


72.0 


43.0 




70.1 


85.2 




Identity 
(%) 


54.0 


59.0 


42.6 




35.8 










43.0 


34.4


32.8 




C3> 


99.6 


64.0 


32.0 




32.7 


63.8 


25 

0) 

ID 

.E 
"c 
o 

30 ^ 
JD 

35 
40 


Homologous gene 


Neisseria gonorrhoeae ORF24 
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Function 


hypothetical protein 


nitrogen fixation protein 


ABC transporter ATP-binding 


hypothetical protein 


ABC transporter 


DNA-binding protein 


hypothetical membrane protein


ABC transporter 


hypothetical protein 


hypothetical protein 




helicase 
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Function 


glucose-6-phosphate 
dehydrogenase 


oxppcycle protein (glucose 6- 
phosphate dehydrogenase 
assembly protein) 


6-phosphogluconolactonase 


sarcosine oxidase 


transposase(IS1676) 


sarcosine oxidase 

1 








triose-phosphate isomerase 


probable membrane protein 


phosphoglycerate kinase 


glyceraldehyde-3-phosphate 
dehydrogenase 


hypothetical protein 


hypothetical protein 


hypothetical protein 


excinuclease ABC subunit C 
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Function 


hypothetical protein 


6,7-dimethyl-8-ribityllumazine
synthase


polypeptide encoded by rib operon


riboflavin biosynthetic protein 


polypeptide encoded by rib operon


GTP cyclohydrolase II and 3,4-
dihydroxy-2-butanone 4-phosphate
synthase (riboflavin synthesis)


riboflavin synthase alpha chain 


riboflavin-specific deaminase 


ribulose-phosphate 3-epimerase 


nucleolar protein NOL1/NOP2
(eukaryotes) family


methionyl-tRNA formyltransferase 


polypeptide deformylase


primosomal protein n 


S-adenosylmethionine synthetase


DNA/pantothenate metabolism 
flavoprotein 


hypothetical protein 


guanylate kinase 


integration host factor 
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Escherichia coli K12 
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Bacillus subtilis 
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Mycobacterium tuberculosis ribA 


Actinobacillus 
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Escherichia coli K12 ribD 


Saccharomyces cerevisiae 
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Brevibaderium flavum MJ-233 


Mycobacterium tuberculosis
H37Rv Rv1391 dfp


Mycobacterium tuberculosis
H37Rv Rv1390


Saccharomyces cerevisiae guk1


Mycobacterium tuberculosis
H37Rv Rv1388 mIHF


40 


db Match 


(- 
o 
> 

m* 
<o 

> 

idu 


_j 
o 
o 

LU 

m' 

w 

cc 

b. 
(/] 


GSP:Y83273 


GSP:Y83272 


GSP:Y83273 


gp:AF001929_1 


sp:RISA_ACTPL 


sp:RIBD_ECOLI


1- 
< 

LU 

CC 

cL 
t/) 


—I 
o 
a 

LU 

z 

ID 

C/) 

Q. 

tfi 


sp:FMT_PSEAE


sp:DEF_BACSU


sp:PRIA_ECOLI


gsp:R80060 


3 
l_ 

O 
>- 

:e 

\ 

Q. 
b. 
O 

ci. 
vt 


sp:YO90_MYCTU 


pir:KIBYGU


pir:B70899 






o> 
in 




CO 

CN 
CN 




CD 
CO 
CO 


1266 


CO 
CO 
CO 


CO 


to 

CD 


1332 


m 

C3> 


o 
in 


2064 


1221 


1260 


5> 

CM 


h- 

CM 
CO 


CD 
CO 


45 


Terminal 
(nt) 


1689201 


1689869 


1690921 


1691421 


1691347 


1690360 


1691639 


1692275 


1693262 


1693967 


1695499 


1696466 


1697084 


1699177 


1700508 


1702032 


1702411 


1702991 


50 


Initial
(nt) 


1689779 


1690345 


1690694 


1690708 


1691012 


1691625 


1692271 


1693258 


1693918 


1695298 


1696443 


1696972 


1699147 


1700397 


1701767 


1702322 


1703037 


1703308 




SEQ NO.
(a.a.)


5260


5261


5262


5263


5264


5265


5266


5267


5268


5269


5270


5271


5272


5273


5274


5275


5276


5277


55 




1760


1761


1762


1763


1764


1765


1766


1767


1768


1769


1770


1771


1772


1773


1774


1775


1776


1777





Function 


orotidine-5'-phosphate
decarboxylase


carbamoyl-phosphate synthase 
large chain 


carbamoyl-phosphate synthase 
small chain 
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aspartate carbamoyltransferase 


phosphoribosyl transferase or 
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bacterial regulatory protein, arsR 
family 


ABC transporter 




iron{(ll)ABC transporter, 
periplasmic-binding protein 


ferrichrome transport ATP-binding 
protein 


shikimate 5-dehydrogenase 


hypothetical protein


hypothetical protein 
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oxidoreductase 




NADH-dependent FMN reductase 


L-serine dehydratase 




alpha-glycerolphosphate oxidase 


histidyl-tRNA synthetase 


hydrolase 


cyclophilin




hypothetical protein 
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Table 1 (continued) 
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sporulation transcription factor 
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hypothetical protein 


insertion element (IS3 related) 


insertion element (IS3 related) 






single-stranded-DNA-specific 
exonuclease 




primase 


15 




Matched 
length 
(a.a) 


CD 
CN 


















in 










CO 
CD 


CO 

cn 

CN 


o 






CN 
CM 
CD 




cn 


20 




Similarity 
(%) 


65.7 


















55.2 










75.0 


95.6 


84.2 






CD 
O 

m 




64.3 






Identity 
(%) 


34.3 


















22.6 










1 63.0 


87.9 


72.3 






24.0 




31.8 




Table 1 (continued) 


Homologous gene 


Streptomyces coelicolor A3(2) 
WhiH 


















Thermotoga maritima MSB8 
TM1189 










Corynebacterium glutamicum 


Corynebacterium glutamicum 
orf2 


Corynebacterium glutamicum 
orfl 






Erwinia chrysanthemi recJ 




Streptococcus phage phi-O1205
ORF13


40 




db Match 


gp:SCA32WHIH_6 


















pir:C72285 










pir:S60891


pir:S60890 


<7) 
GO 
CO 

o 

CO 

CO 

l: 
'o. 






sp:RECJ_ERWCH




pir:T13302






ORF 


GO 
CO 


CJ) 
CO 


CO 

in 


CD 
CO 


CN 
CD 




m 

CO 


CD 
CO 
CO 


o 

CN 


2202 


1746 


CJ> 
CN 




O) 
CN 
N" 


ro 
in 


C3) 
CD 


rr 
Oi 
CN 


CO 
r~ 
CM 


1299 


1878 


o 

GO 


1650 


45 




Terminal 
(nt) 


1814517 


1815651 


1816128 


1816636 


1817803 


1818219 


1818774 


1819166 


1819748 


1820181 


1824322 


1824589 


1824927 


1825178


1826557


1825751 


1826644 


1829688 


1832063 


1834044 


1834149 


1838324 


50 




Initial 
(nt) 


1813780 


1814863 


1815673 


1816451 


1817132 


1817803 


o 

CD 
00 
00 


1818798 


1819954 


1822382 


1822577


1824371 


1824784 


1825606 


1826024 


1826644 


1826937 


1829900 


1830765 


1832167 


1834928 


1836675 






SEQ NO.
(a.a.)


5396


5397


5398


5399


5400


5401


5402


5403


5404


5405


5406


5407


5408


5409


5410


5411


5412


5413


5414


5415


5416


5417


55 




SEQ NO.
(DNA)


1896


1897


1898


1899


1900


1901


1902


1903


1904


1905


1906


1907


1908


1909


1910


1911


1912


1913


1914


1915


1916


1917






















































ATP- 


5 


































SH3 














ATP-dependent Clp proteinase
ATP-binding subunit


10 


Function 








helicase 




phage N15 protein gp57 




















actin binding protein with
SH3 domains










ATP/GTP binding protein 




15 


Matched 
length 
(aa) 








o 

CM 
CO 




CD 
O 




















CN 
CM 










cn 




o 

CO 
CO 




















































20 


Similarity
(%) 












64.2 




















49.8 










52.5 




q 

CD 






















































1? 








CN 
CN 




36.7 




















28.7 










23.6 




30.2 


25 

0) 

-E 

35 


Homologous gene 








Mycoplasma pneumoniae ATCC 
29342 yb95 




Bacteriophage N15 gene57




















Schizosaccharomyces pombe 
SPAPJ760.02C 










Streptomyces coelicolor 
SC5C7.14 




Escherichia coli K12 clpA


40 


db Match 








□. 
U 

> 

co' 

o 
> 

CL 

to 




pir:T13144 




















gp:SPAPJ760_2 










a 
m 
O 
to 

CL 

cn 




sp:CLPA_ECOLI 






3789 


TT 


ro 
in 


1839 


CO 


(O 
CO 
CO 


CO 
CD 
CO 


CO 
CO 


CO 

tn 


CO 
CN 

tn 


CO 

cn 


CO 
CO 


CM 
CO 


CO 
CO 


CO 


1221 


CN 

in 

CO 


1395 


<n 
to 


o 

OO 


1257 


in 
oo 


1965 


45 


Terminal 
(nt) 


1842137 


1842681 


1843337 


1845356 


1845857 


1846207 


1846333 


1847932 


CO 
OO 


1849036 


1849785 


1849966 


1850406 


1849978 


1850474 


1852440 


1852324 


1853873 


1854854 


1855237 


1856788 


1858738


1860727 


50 


Initial 
(nt) 


1838349 


1842235 


1842804 


1843518 


ro 
CD 

in 

OO 


1845872 


1846698 


1847315 


1847938 


1848509 


1848988 


CO 
C7> 
CO 


1850035 


1850415 


1851049 


1851220 


1851473 


1852479 


1854261 


1855058 


1855532 


1856885 


1858763


1 


SEQ NO.
(a.a.)


5418


5419


5420


5421


5422


5423


5424


5425


5426


5427


5428


5429


5430


5431


5432


5433


5434


5435


5436


5437


5438


5439


5440


55 


SEQ NO.
(DNA)


1918


1919


1920


1921


1922


1923


1924


1925


1926


1927


1928


1929


1930


1931


1932


1933


1934


1935


1936


1937


1938


1939


1940






























0) 
























Function 










ATP-dependent helicase 










hypothetical protein 


deoxynucleotide monophosphate
kinase










type II 5-cytosine
methyltransferase


type II restriction endonuclease 






hypothetical protein




15 


Matched 
length 
(aa) 










CO 

a> 

CD 










CN 
CN 


00 

o 

CN 










CD 

to 

CO 


CO 

m 

CO 






■V 
o 
m 






Similarity 
(%) 










a> 










00 


ID 


















00 




20 










id' 










xr 


CO 










ai 








ID 






Identity 
(%) 




















a> 


1^ 










CN 








CO 














CM 










iri 

CN 


CO 










ai 


ai 
a> 






CN 




Table 1 (continued) 


Homologous gene 










Staphylococcus aureus SA20
pcrA










Streptomyces coelicolor A3(2)
SCH17.07c


Bacteriophage phi-C31 gp52 










Corynebacterium glutamicum
ATCC 13032 cglIM


Corynebacterium glutamicum
ATCC 13032 cglIR






Streptomyces coelicolor A3(2) 
SC1A2.16C 














1 


































40 


db Match 










»— 

CO 

<' 

q: 
O 
0- 

d. 
to 










gp:SCH17_7 


> 

XT 

ID 
CN 

t 
CL 










prf:2403350A


in 

in 
in 
< 

CL 






CD 

O 
CO 
o. 

O) 






n 




CD 

to 


CM 
CO 


CNJ 

CO 


2355 


CO 
ID 

to 


CO 
CO 


in 

CD 


CD 

CNJ 




rx 
o 


ID 
CM 
CM 


2166


CO 
CN 


6507


1089 


1074 


1521 




1818 


CO 
CO 

T— 


45 


Terminal 
(nt) 


1861225 


1861475 


1861519 


O) 

o> 
r~) 

CN 
CO 
CO 


1865299 


1865822 


1866219 


1866792


1867095 


1867874 


1868587 


1868671 


1868927 


1871101 


1871380 


1879400 


1880485 


1882470 


1884220 


1887047 


1887590 


50 


Initial 
(nt) 


1860752 


1861320 


1861842 


1862088 


ID 
^ 
O 
CN 
CD 
CO 


1865265 


1865842


1866328 


1866832 


1867098 


1867886 


ID 
CJ> 
CO 
1X3 
CO 
CO 


1871092 


1871373 


1877886


1878312 


1879412 


1883990


1884936 


1885230 


1887405 




CO 2 ™, 




5442 


i ^ 

1 ID 


5444 


ID 

in 


lO 
ID 


5447 


CO 

in 


o> 

1 XT 

Is 


o 
in 

ID 


5451 


CM 
ID 

in 


5453 


5454 


to 
to 

in 


5456 


5457 


5458 


5459 


5460 


5461 


55 


SEQ 
NO. 
(DNA) 


! cn 
I .. 


C7i 


1 ^ 


CT) 


ID 


CD 

cn 




CO 
Oi 


1949


O 
VD 


cn 


CN 

in 

CI> 


CO 

in 

CT) 


N- 

tn 
o> 


lin 

i ID 

i o> 


CO 

in 
cr> 


1957 


1958 


1959 


1960 


1961 





Function 


SNF2/Rad54 helicase-related 
protein 


hypothetical protein 




hypothetical protein 








endopeptidase Clp ATP-binding
chain B














nuclear mitotic apparatus protein 




















15 


Matched 
length 

(a.a) 


o 


CO 
CD 




CO 

in 








CN 














1004 




















20 


Similarity 
(%) 


70.0 


56.4 




C3> 








52.5














49.1 






















Identity 

(%) 


to 


33.1 




20.7 








25.3 














20.1 




















25 ^ 

C 

o 

JQ 

35 
40 


Homologous gene 


Deinococcus radiodurans
DR1258 


Lactobacillus phage phi-g1e
Rorf232




Bacillus anthracis pXO2-16








Escherichia coli clpB














Homo sapiens numA 




















db Match 


CO 
CD 
O 

o 
at 
< 

Ol 


pir:T13226 




gp:AF188935_16








sp:CLPB_ECOLI 














pir:S23647 






















ORF 
(bp) 


to 

CO 


CO 
00 


o 

CO 
CO 


1680 


1206 


1293 


CO 

a> 

CN 


1785 


CN 
CD 


1113


CO 
GO 


oo 
cn 


cn 

OO 


OO 
Oi 


2766 


o 
o 

CD 


1251 


CO 

a> 

CO 


t— 


1008 


1659 


OO 
OO 
XT 


cn 

€Ti 
CO 


1509 


45 


Terminal 
(nt) 


1887688 


1888231 


1889859 


1890028 


1891832 


1893388


1894739


1897374 


1899233 


1899804 


1901066 


1902955


1902005 


1903225 


1903113 


1905973 


1906664 


1907965 


1908785 


1909501 


1910642 


1912333 


1913973 


1914725 


50 


Initial 
(nt) 


1888038 


1889094 


1889530 


1891707 


1893037 


o 

00 
CO 

CO 


1897231 


1899158 


1899853 


1900916 


1901911


1901975 


1902883 


1903028 


1905878 


1906572 


1907914 


1908660 


1909498 


1910508 


1912300 


1913820 


1914371 


1916233




SEQ NO.
(a.a.)


5462


5463


5464


5465


5466


5467


5468


5469


5470


5471


5472


5473


5474


5475


5476


5477


5478


5479


5480


5481


5482


5483


5484


5485


55 


SEQ NO.
(DNA)


1962


1963


1964


1965


1966


1967


1968


1969


1970


1971


1972


1973


1974


1975


1976


1977


1978


1979


1980


1981


1982


1983


1984


1985






0^1. 



1° 

: to 



o 
o 

CD 



CO 
CD 

o 



-si 

o 



O 



cn 
cn 
O 



cn 
to 



m 

—I 

> 



2 2 



o 
o 



cn 



CjJ 

ro 

OD 



o 
o 



cn 
o 



to 
CO 



to 
cn 



CO 
CD 



X 

o 
o> 

CO 



X 2 

GJi< 
^ O 

^ 2. 

< (D 

to ^• 

cn 3 

c" 
cr 



GO 



-D 

o 



to to 
to to I 

to : CO I 



to <D 

to to 
_* o 



H 52 



2. 3 



55 



05. 



5J^ 



cr 



X 
o 
3 
o. 
o 

o 



CD 



Q. 

51. 



p to g. 



-n 
c 

O 

5' 

3 



OP 



CD 



oe 



9S 



01 






11^ i 



I o o 



ro I ro 
o o 



o 
o 
o 



00 
CO 



■o 

6 

CO 

o 
o 

O 



^ as 

83 



(O 
CO 



CA C/> 

o n> 
o 
— » 

a. 

CO 



CD 

5" 



fvJ fO 
o o 
ro I ro 



ro ro 

o o 

ro ro 

on ^ 



cn cn 
cn (ji 
ro ro 



m 



ro 
o 
ro 



cn 
oi 

lO 



to 

O) 

cn 
o 



o 
o 



O 
CO 



o 
o 

CD 



O Q) r-r 

i §. 

cu to 

It 



p 



ro 



"5 3 

2 ° 

O (D 



■o 

cn 

■a 
o 



ro a> 
to ro 



2 S 
z O rn 

> ^ O 



05 



3 

65' 



2. 3 



Q) 



X 
o 
3 
o 



o 
c 



CL 

is 



2| 



IT (D 
Q. 



O 
3 



(7^ 






CD 



NO 

o 



M NJ 

o o 
CM 



o o 
cn I cn 
u) ro 

cn 
cn 



o 

cn 



ro 
o 



o 



O 



ro ro 

O i o 



00 
fO 
O 

to 



<D CO 

CO — 



ro 
o 



to 

OD 

o 



ro 
o 
lo 



cn 



CO 

oo 
o 

OD 



to 

OS 
CO 

ro 

CO 



(O 

cn 

(O 



05 

in 

CO 



O) 



O) 
CO 



CJI 

cn 

00 



oo 

oo 



(J) 

lO 



ro 



o 

5 



> 

o 



3 

or 

Q)' 
CD 

> 

ca 
CO 
■D 
ro 



ro 

CO 



ro I ro 



CO 



cji cn 
CJI cn 

C*> OJ 

c*> 



z o m 



p o 



m 



-3 5 



2. 3 



FT 



X 
o 
3 
o 
o 






o. 



p» CO 



CD 

Ql 






: o 

00 


2077 


2076 


2075


2074


2073 


2072 


2071 



2070 


2069 


2068 


2067 


2066 


2065 


2064 


2063 


2062 


2061 


2060 


2059 


SEQ
NO.
(DNA)


5578 


5577 


5576 


5575 


5574 


5573 


5572 


5571 


5570 


5569 


5568 


5567 


5566 


5565 


5564 


5563 


5562 


5561 


5560 


5559 


SEQ 
NO 

(aa.) 


1995294 


1994121 


1992536 


1991620 


1990764 


1990667 


1989605 


1988664 


CO 
CD 
<» 
^ 
00 
CO 


1988383 


1988303 


1987896 


1986590 


1985373 


1985092 


1984387 


1984217 


1983918 


1983611 


1983186 


Initial 
(nt) 


1994608 


1992538 


1991795 


1991189 


1989874 


1991020 


1988778 


1988530 

i 


1988370 


1988589


1987887 


1987507 


1985442 


1985071 


1985364 


1984728 


1984450 


1984181 


1983683 


1983548 


Terminal 
(nt) 


O) 
00 

-nJ 


1584 




t> 
ro 


oo 

CO 


CO 

cn 


CD 

ro 
oo 


CO 

cn 


x^ 


ro 
0 




CO 
CO 

0 


1149 


CJ 

0 

CJ 


ro 

CO 


CO 
X^ 

ro 


ro 

CO 


fO 

a> 


ro 

CO 


CO 

cn 

CO 




V) 

< 

ro 

no 

I— 
cn 


(/> 

O 
CO 
13 

'o 

o 
;o 
G) 
r- 










gp:SCJ11_12 


T3 

n' 
CO 

a> 

o 

00 
oo 
CO 


gsp:R21601 




gsp:R23011 


gsp:R23011


tn 
< 

CD 
T3 

|— 
cn 
















db Match 


Mycobacterium phage L5 int 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1 










Streptomyces coelicolor A3(2)
SCJ11.12 


Corynebacterium glutamicum 
orfl 


Brevibacterium lactofermentum
CGL2005 ISaB1




Brevibacterium lactofermentum
CGL2005 ISaB1


Brevibacterium lactofermentum
CGL2005 ISaB1


Mycobacterium phage L5 int 
















Homologous gene 


28.7 


25.0 










31.1 


74.4 


80.7 




70.9 


83.9 


ro 

CO 

a> 
















Identity 
(%) 


56.1 


37.0 










53.7 


00 
CO 


96.8 




CD 


(O 

Xk 


55.9 
















Similarity 
(%) 


ro 
ro 
u> 


CM 
CO 










ro 
o 


CJ 


CO 






(O 


0 
















Matched 
length 
(aa) 


integrase 


major secreted protein PS1 protein 
precursor 










transposase 


insertion element (IS3 related) 


transposition repressor 




transposase (divided) 


transposase (divided) 


integrase 
















Function 






Ik3 
' O 

o 
cn 

00 



to 
o 
<o 



cn 



o 
o 

CO 

ro 



o 
o 

CO 



o 



cn 
cn 



ro 
o 
o 

CD 



o 
o 

CO 
CD 



m 
->i 
o 
cn 

CO 

o 



X S 

O 

^ 2- 

< cr 
X o 
fi) 

CD § 
CO ^ 



O 



b 
c 

— t 

I 

o 
o 



CO O) 

o ^ 
ro ro 

m"2. 
to o 
b. 3 

c: 



ro 



->4 

o 



cn 
o 



ro 



ro 



3 a. 
c ro 

s| 

&§; 

^ cn 

en -o 
ro or 

C3 

-o 
ro 



-T 

I ro 



O 

o 
cn 

o 



X S 
< ^ 

CO § 

n ^ 
c 
cr 
ro 



CO 



3- 
O 



o 
ro 



ro 
o 

CO 



cn 
cn 

CO 



ro 
o 
o 
a> 
a> 



to 
o 
o 

2 

Oi 
to 



ro 

CJ 



m 
-J 
ro 
ro 

CO 
00 



gl 

o 

Q) 

3 



ro 
o 
to 
o 



cn 
cn 
to 
o 



ro t ro 
o o 

CD CD 



cn 
cn 
no 
to 



lO 

o 
o 
cn 
c*J 
o 

(O 



ro 
o 
o 

o 
ro 



> 

m 
o 
ro 

O) 
CO 



o 
3 



o 
o 
to 
o 

to 



CO 

c*> 



ro 

C3> 
CO 



X 
> 

m 



cn 
cn 

CD 
00 



o 
o 



to 
o 
o 
to 



C3> 

to 



to 
o 

CD 

-J 



cn 

OD 



ro 
o 
o 

ro 



to 
o 
o 
o 
cn 
ro 



to 



> 
3 



ro 

0} 
to 

ro 



00 



w 

CO 



ro 

6 



■a 
rr 
o 

3r 



o ^. 

CO c 
CL d 



lO 

cn 
CO 



cn 
lo 



o 

3 
C 



O 

o 
cn 
ro 

00 



X S 

CO v< 
•>! O 

ro 

cn 

OD ^ 



cn 
cn 



fO 

o 



X 

-si 

o 

CO 

a> 

CO 



X 2 

CO v< 

r> 

^ 2- 
< 

iS5 

cn 3 
o ^ 



cn 
cn 



hypothetical protein


hypothetical protein



ro 
o 

CD 



cn 

03 



ro 
o 
00 



cn 
cn 



CO 

to 
00 
ro 



CO 

cn 



O 
o 

CO 

cn 

OD 



Mycobacterium tuberculosis
H37Rv Rv2673


Mycobacterium tuberculosis
H37Rv Rv2671 ribD


V* 


CO 



ro 



3 
ro 

3 
cr 



T3 

o_ 
ro 



ro to to 
o o i o 
03 t 00 ; C30 
(O 1 o 



cn 



cn cn 
cn cn 

CO CD 

o 



CO 

CO 



to 

CO 



o 



a' 



T3 
O 

ro 

3' 



"=> z J2 
503 



cn 
cn 

CO 



'S^' 2 S 
-OS 



CO 
CO 

cn 
o 

CP 

00 



CO 
(O 

tn 



CO 

o 

cn 



CO 



S9 



ro 

^ 3 



X X 




ro 








§8" 


X 




0 


ro 


3 




0 


pyl 


log 


0 


0 


3. 


c 


ro 


tA 


CD 


to 


o> 


ro 


CO 


3 


cn 


ro 






o. 



^ ft) QJ 
" O 

3- ro 
o. 



-n 
c 

3 

n 



ay 

CD 



CD 
Q. 






2113 


2112 


2111 


2110 


2109 


2108 


2107 


2106 


2105 


2104 


2103 


2102


2101 


2100 


2099 


2098 


2097 


SEQ 

NO. 

(DNA)


5613 


5612 


5611 


5610 


5609 


5608 


5607 


5606 


5605 


5604 


5603 


5602 


5601 


5600 


5599 


5598


5597 


SEQ 
NO. 
(a.a.) 


2026494 


2025423 


2025270 


2022959


2022546


2022266 


2020293 


2018744 


2018202 


2018119 


2017966 


2016121 


2015496 


2011863 


2010555 


2010539 


2009570 


Initial
(nt) 


2029043 


2026379 


2023948 


2023945 


2022313 


2022949 


2020724 


2020276 


2017966 


2018754 


2016257 


2015585 


2014162 


2013356 


2011382 


2009724 


2009280 


Terminal 
(nt) 


2550 


CO 

cn 


1323 


CO 
CD 


ro 

CaJ 


Oi 
oo 


CO 

ro 


1533 


fO 
CO 


CJ> 
CO 

Oi 


1710 


cn 

CO 

-n1 


1335 


1494 


CD 

ro 

OD 


CD 

cn 


ro 

CO 


^ ~n 


sp:MTR4_YEAST 


pir:E70532




sp:GALE_BRELA 


GP:AF010134_1 


pir:I40339


prf:2204286C 


CO 

■o 

CO 

o 

X 

r 

00 


pir:G70531 


pir:H70531 


sp:Y065_MYCTU 




sp:YRKO_BACSU 


prf:2204286A 


cn 

■q 
TJ 
"0 
O 

-< 
o 

H 

c 


sp:SUHB_ECOLI 


pir:F70530 


db Match 


Saccharomyces cerevisiae 
YJL050W dob1


Mycobacterium tuberculosis 
H37RV Rv2714 




Corynebacterium glutamicum 
ATCC 13869 (Brevibacterium 
extragenic suppressor protein
2072066
2085879


2085436 


2082932 


2082105 


2082813 


2080387 


2077122 


2076392 


2073294 


2071799 


2072878 


2071740 


2071599 


2070519 


2069997 


2069616 


2068556 


2069392 


Terminal 
(nt) 


<D 
CO 


to 
o> 


2259 


to 

CJ) 

4^ 


CJl 
CO 
CO 


cn 
o 


2154 


(n 

CO 
CO 


2763 


1107 


00 
CO 




ro 

00 

cn 


O) 

o 

CO 


cn 
Oi 


CO 

ro 
k. 


CD 

to 

CO 


o> 

CO 
C3 


IS 


prf:2518365A 


pir:F69700 


ro 
ro 

CJ 

> 






sp:YDAP_BRELA 
Leishmania major


Bacillus subtilis rpsO 


Streptomyces antibioticus gpsI 






Corynebacterium glutamicum 
(Brevibacterium lactofermentum)
ATCC 13869 orf2 


Corynebacterium glutamicum 
ATCC 13032 orf4 


Streptomyces coelicolor A3(2) 
SC4G6.14 


Bacillus subtilis 168 spoIIIE


Escherichia coli terC 




Streptococcus pneumoniae 
DBL5 pspA 


Arabidopsis thaliana
ATSP:T 161 18.20 


Streptococcus pyogenes pgsA 


Streptococcus pneumoniae R6X
cinA


Mycobacterium tuberculosis 
H37RV Rv2745c 


Mycobacterium tuberculosis 
H37Rv RV2744C 


Mycobacterium tuberculosis


Homologous gene 
nucleoside hydrolase


30S ribosomal protein S15 


guanosine pentaphosphate 
synthetase 






hypothetical protein 


hypothetical protein 


hypothetical protein 


stage III sporulation protein E 


tellurite resistance protein 




surface protein (Pneumococcal
surface protein A) 


hypothetical protein 


phosphatidylglycerophosphate
synthase 


competence damage induced 
proteins 


regulator (DNA-binding protein) 


hypothetical protein (35kD protein)


hypothetical protein 
2100240
2097179
2093046


2092055 
2089868


2088181 


2087973 


2087941 
2101841
Mycobacterium tuberculosis
H37RV Rv3663c dppD 


Bacillus subtilis spo0KC


Escherichia coli K12 dppB


Bacillus subtilis 168 dppE 


Mycobacterium tuberculosis 
H37RV Rv2842c 




Bacillus subtilis 168 nusA 


Streptomyces coelicolor A3(2) 
SC5H4.29 


Stigmatella aurantiaca DW4 infB 


Bacillus subtilis 168 rbfA 


Mycobacterium tuberculosis 
H37RV Rv2837c 


Mycobacterium tuberculosis 
H37RV Rv2836c dinP 


Mycobacterium tuberculosis 
H37RV Rv2795c 


Streptomyces coelicolor A3(2) 
SC5A7.23 


Corynebacterium 
ammoniagenes 


Bacillus subtilis 168 truB 


Corynebacterium 
ammoniagenes ATCC 6872 ribF 


Homologous gene 
peptide transport system ABC-transporter ATP-binding protein


oligopeptide permease 


peptide transport system permease


peptide-binding protein 


hypothetical protein 




n-utilization substance protein 
(transcriptional 

termination/antitermination factor) 
translation initiation factor IF-2


ribosome-binding factor A 


hypothetical protein 


DNA damaged inducible protein f 


phosphoesterase 


hypothetical protein 


hypothetical protein 


tRNA pseudouridine synthase B 


bifunctional protein (riboflavin kinase 
and FAD synthetase) 
Function


hypothetical protein 


site-specific recombinase 


hypothetical protein 


Mg(2+) chelatase family protein 


hypothetical protein 


hypothetical protein 


ribonuclease HII 




signal peptidase 


Fe-regulated protein 




50S ribosomal protein L19


thiamine phosphate 
pyrophosphorylase 


oxidoreductase 


thiamine biosynthetic enzyme thiS 
thiamine biosynthetic enzyme thiG
protein 


molybdopterin biosynthesis protein 
2-oxoglutarate/malate translocator
Function


hypothetical protein 


peptidase 


sucrose transport protein 






maltodextrin phosphorylase/ 
glycogen phosphorylase 


hypothetical protein 


prolipoprotein diacylglyceryl 
transferase 


indole-3-glycerol-phosphate 
synthase / anthranilate synthase 
component II 


hypothetical membrane protein 
Function


DNA polymerase III epsilon chain




maltooligosyl trehalose synthase 


hypothetical protein 










alkanal monooxygenase alpha chain 


hypothetical protein 




maltooligosyltrehalose
trehalohydrolase 


hypothetical protein 


threonine dehydratase 






Corynebacterium glutamicum AS019 


DNA polymerase III 


chloramphenicol sensitive protein 


histidine-binding protein precursor 


hypothetical membrane protein 


Archaeoglobus fulgidus AF2388 


Homologous gene 


Streptomyces coelicolor A3(2)
SCI8.12 




Arthrobacter sp. Q36 treY 


Deinococcus radiodurans
DR1631 










Photorhabdus luminescens 
ATCC 29999 luxA


Streptomyces coelicolor A3(2)
SC7H2.05 




Arthrobacter sp. Q36 treZ


Bacillus subtilis 168 


Corynebacterium glutamicum
ATCC 13032 ilvA 






Catharanthus roseus metE 


Streptomyces coelicolor A3(2)
dnaE 


Escherichia coli K12 rarD 


Campylobacter jejuni DZ72 
S65769


<o' 

O 
O 
CM 
O 
O 
LU 
< 










LXA1_PHOLU


in 

cn' 
X 

O 
(/) 




S65770 


YVYE_BACSU 


_j 
CD 

O 

o 
^1 

Q 
X 






S57636 


2508371A 


RARD_ECOLI


LU 

< 

-o 
t/3 

X 


D69548 




CL 

cn 




ex 


CL 
Ol 










sp. 


CL 
O) 




'q. 


sp: 


:ds 






pir: 


nrf 

prr. 


sp: 


CL 

in 


pir; 


si 


1143 


O 
CO 


2433 


1023 


cn 

C7> 

ro 


oo 

at 


cn 

CO 


1056 


1044 


00 

r»- 
ro 


CO 
CN 


1785 


in 

CO 


1308 


O 

in 


CD 

m 


1203 


3582 


o 

OO 
CD
Terminal
(nt) 


2234070 


2234763 


2237284 


2238353 


2238694 


2239845 


2240058 


2239508 


2241724 


2241738 


2242129 


2244819 


2242393 


2244864 


2246892 


2246295 


2247006 


2248358 


2252856 


2253659 


2254642 


Initial 
(nt) 


2232928 


2234158 


2234852 


2237331 


2239092 


2240042 


2240246 


2240563 


2240681 


22421151 


2242359 


2243035 


2243043 


CO 

CN 
2253725
Function


short chain dehydrogenase or 
general stress protein 


diaminopimelate (DAP) 
decarboxylase 


cysteine synthase 




ribosomal large subunit
pseudouridine synthase D 


lipoprotein signal peptidase 




oleandomycin resistance protein 




hypothetical protein 


L-asparaginase


DNA-damage-inducible protein P


hypothetical membrane protein


transcriptional regulator 




hypothetical protein 


isoleucyl-tRNA synthetase 
47.6


CO 

to 




61.0 
57.6


62.0 
61.5
67.0


65.4 
22.9


32.8 




36.5 


33.8 




36.4




36.7 


31.2


31.8


31.5 


CO 




42.0 


38.5 






Table 1 (continued)
Bacillus subtilis 168 ydaD


Pseudomonas aeruginosa lysA 


Alcaligenes eutrophus CH34 
cysM 




Escherichia coli K12 rluD


Pseudomonas fluorescens NCIB 
10586 lspA




Streptomyces antibioticus oleB 




Rhodococcus erythropolis orf17 


Bacillus licheniformis 


Escherichia coli K12 dinP 


Escherichia coli K12 ybiF


Streptomyces coelicolor A3(2) 
SCF51.06




Streptomyces coelicolor A3(2) 
SCF51.05


Saccharomyces cerevisiae 
A364A YBL076C ILS1
Initial
(nt)
Function


hypothetical membrane protein 


3-deoxy-D-arabino-heptulosonate-7-phosphate synthase


hypothetical protein 


hypothetical membrane protein 


major secreted protein PS1 protein 
precursor 






hypothetical membrane protein 


acyltransferase 


glycosyl transferase 


protein P60 precursor (invasion- 
associated-protein) 


protein P60 precursor (Invasion- 
associated-protein) 


ubiquinol-cytochrome c reductase 
cytochrome b subunit 


ubiquinol-cytochrome c reductase
iron-sulfur subunit (Rieske [2Fe-2S]
iron-sulfur protein cyoB)


ubiquinol-cytochrome c reductase 
cytochrome c 
length
Identity
66.9


58.4 


35.1 


28.2 
1
50.1


26.4 


33.0 


CO 
CO 


37.9 


58.6 


Homologous gene 


Mycobacterium tuberculosis 
H37RV RV2181 


Amycolatopsis mediterranei 


Mycobacterium leprae 
MLCB268.21C 


Mycobacterium tuberculosis 
H37RV RV2181 


Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 csp1






Corynebacterium glutamicum 
ATCC 13032 


Corynebacterium glutamicum 
ATCC 13032 


Streptomyces coelicolor A3(2)
SC6G10.05c


Listeria ivanovii iap


Listeria grayi iap


Heliobacillus mobilis petB


Streptomyces lividans qcrA 


Mycobacterium tuberculosis 
H37RV Rv2194 qcrC 
CO
2307621


2307697 


2309173 


2312252 


2313608 


2314036


2313916 


2314236 


2315678 


2317633 


2318804 


2319968 


2321472 


2323088 


2324311 


Initial 
(nt) 


2306314 


2309082 


2309676 


2309835 


2312360 
2313833


2314092 


2315423


2316412 
2319850


2320594 


2323073 


2323759 


2325195 


SEQ
NO.
(a.a.)


5887 


5888 


5889 
5895
5900
NO.
Function


cytochrome c oxidase subunit III 




hypothetical membrane protein 


cytochrome c oxidase subunit II 


glutamine-dependent 
amidotransferase or asparagine 
synthetase (lysozyme insensitivity 
protein) 


hypothetical protein 


hypothetical membrane protein 


cobinamide kinase 


nicotinate-nucleotide- 
dimethylbenzimidazole 
phosphoribosyltransferase 


cobalamin (5-phosphate) synthase 




clavulanate-9-aldehyde reductase 


branched-chain amino acid 
aminotransferase 


leucyl aminopeptidase 


hypothetical protein 


dihydrolipoamide acetyltransferase




lipoyltransferase


Matched
length 
(aa) 
CO
Identity
28.7
CO


43.0 


37.8 


25.3 




38.6 


o 


36.3 


40.2 


48.9 




CO 
CO 


Homologous gene 


Synechococcus vulcanus 




Mycobacterium tuberculosis 
H37RV Rv2199c 


Rhodobacter sphaeroides ctaC 


Corynebacterium glutamicum
KY9611 ltsA


Corynebacterium glutamicum 
KY9611 orf1


Mycobacterium leprae 
MLCB22.07 
lipoic acid synthetase


hypothetical membrane protein 


hypothetical membrane protein 


transposase (ISCg2)




hypothetical membrane protein 




mutator mutT domain protein 


hypothetical protein 
heme oxygenase


glutamate-ammonia-ligase 
adenylyltransferase 


glutamine synthetase 


hypothetical protein 


hypothetical protein 


hypothetical protein 


galactokinase 


virulence-associated protein 




bifunctional protein (ribonuclease H
and phosphoglycerate mutase) 




hypothetical protein 
transcriptional regulator




hypothetical protein 




pyruvate dehydrogenase component 




ABC transporter or glutamine 
transport ATP-binding protein 




ribose transport system permease 
protein 


hypothetical protein 


calcium binding protein 
acyl carrier protein
hypothetical protein


hypothetical protein 




glycyl-tRNA synthetase 


bacterial regulatory protein, arsR 
family 


ferric uptake regulation protein 


hypothetical protein (conserved in 
C.glutamicum?) 


hypothetical membrane protein 


undecaprenyl diphosphate synthase 


hypothetical protein 


Era-like GTP-binding protein


hypothetical membrane protein 


hypothetical protein 


Neisserial polypeptides predicted to 
be useful antigens for vaccines and 
protein


hypothetical protein 
heat shock protein dnaJ


heat-inducible transcriptional 
repressor (groEL repressor) 


oxygen-independent 
coproporphyrinogen III oxidase 


agglutinin attachment subunit 
precursor 
carboxylesterase


glycosyl hydrolase or trehalose 
synthase 


hypothetical protein 




Matched 
length 
(a.a) 


o 

CO 
CO 


CD 
CO 


o 

CN 
CO 


CO 






CO 


CO 
CO 


O 
CO 


CO 
CO 


o 






o 

CJ) 
CD 


CO 

in 


m 






Similarity 
(%) 


ru 


79.6 


64.1 


cn 

CO 






75.1 


55.4 


64.4 


51.0 


53.0 






68.3 


45.7 


84.9 


58.8




Identity 
(%) 


47.1 


CN 

od 


33.1 


36.6 






48.0 


28.3 


29.5 


44.0 


o 






40.3 


24.1 


65.2 


32.1 


Table 1 (continued) 


Homologous gene 


Streptomyces albus dnaJ2


Streptomyces albus hrcA 


Bacillus stearothermophilus 
hemN 


Saccharomyces cerevisiae 
YNR044W AGA1 






Streptomyces coelicolor A3(2) 
SC6G10.04


Escherichia coli K12 malQ


Lactobacillus brevis plasmid 
horA 


Neisseria gonorrhoeae 


Neisseria meningitidis 






Salmonella typhimurium dcp 


Anisopteromalus calandrae


Mycobacterium tuberculosis 
H37RV RV0126 


Mycobacterium tuberculosis 
H37RV Rv0127 




db Match 


CD 
CM 

■c- 

CO 
CN 
CN 

f 
CL 


prf:2421342A


prf:2318256A 


sp:AGA1_YEAST 






5 

CO 

O 

CO 

CL 
O) 


sp:MALQ_ECOLI 


gp:AB005752_1 


GSP:Y74827 


Oi 
CN 
CO 

> 
CL 






sp:DCP_SALTY 


X— 

1 

CO 
CN 

in 

CD 
O 

!^ 


pir:G70983 


CO 
oo 

Oi 
O 

X 




ORF 

(Dp/ 


1146 


1023 


o 


CD 
lO 


CO 

a> 

CO 


CO 

1^ 

CO 


1845 


2118 


1863 


in 
in 

CN 


CO 

CO 
CO 


o 

CO 


o 

CN 


2034 


1179 


1794 


1089 




Terminal 
(nt) 


2422700 


2423915 


2424965 


2426699 


2426776 


2427807 


2428184 


2432413 


2434370 


2433614 


2433875 


2434440 


2434573 


2434805 


2438049 


2439906


CD 

o> 

O 
CN 




Initial
(nt) 


2423845


2424937 




2425954 


2425181 


2427468 


2428184 


1 

2430028 


2430296 


2432508 


2433868 


2434207 


2434619 


2434776 


2436838 


2436871 


2438113 


2439906 




SEQ
NO.
(a.a.)


6012 


1 

6013 


6014 


6015 


6016 


6017 


6018 


6019 


6020 


6021 


6022 


6023 


6024 


6025 


6026 


1 — 
6027 


6028 




SEQ 
NO. 
(DNA) 


2512 


2513 


2514 


2515 


2516 


2517 


2518 


2519 


2520 


2521 


2522 


2523


2524 


2525 


2526 


2527 


2528 
Function


isopentenyl-diphosphate Delta- 
isomerase 












beta C-S lyase (degradation of 
aminoethylcysteine) 


branched-chain amino acid transport 
system carrier protein (isoleucine 
uptake) 


alkanal monooxygenase alpha chain 




malonate transporter 


glycolate oxidase subunit 


transcriptional regulator 




hypothetical protein 




heme-binding protein A precursor 
(hemin-binding lipoprotein) 


oligopeptide ABC transporter 
(permease) 


dipeptide transport system 
permease protein 


oligopeptide transport ATP-binding 
protein 


Matched 
length 
(aa) 


o> 

00 












in 

CN 
CD 


to 

CN 


CO 
CO 




CN 
CO 


CO 
OO 


CO 

o 

CN 




1^ 
to 




<o 
m 


in 

CO 


CN 


CN 

CO 


Similarity 


57.7 












100.0 


100.0


49.0 




60.5 


55.1 


65.0 




57.6




55.5 


73.3 


74.5 


65.4 














































31.8 












99.4 


99.8 


21.6 




25.9 


27.7 


25.6 




22.5 




27.5


o 


43.2 


37.4 


Homologous gene 


Chlamydomonas reinhardtii ipi1 












Corynebacterium glutamicum 
ATCC 13032 aecD 


Corynebacterium glutamicum 
ATCC 13032 brnQ 


Vibrio harveyi luxA 




Sinorhizobium meliloti mdcF


Escherichia coli K12 glcD 


Escherichia coli K12 ydfH




Salmonella typhimurium ygiK 




Haemophilus influenzae Rd 
HI0853 hbpA


Bacillus subtilis 168 appB 


Escherichia coli K12 dppC 


Escherichia coli K12 oppD


db Match 


pir:T07979 












«' 

>- 
—1 
O) 

o 
cr 
o 
o 

Q. 
O) 


sp:BRNQ_CORGL 


< 

X 
CD 
> 

s' 

3 
_t 

CL 
V) 




gp:AF155772_2 


—i 

o 
a 

LU 

o' 

o 

_l 

o 

CL 
tn 


-J 
O 
o 

LU 

x' 

U- 

Q 

> 




sp:YGIK_SALTY




LU 
< 
X 

CD 
X 

id. 
w 


sp:APPB_BACSU 


_i 
O 
o 

UJ 

o' 

Q- 
Q- 
Q 

id. 

CO 


prf:2306258MR 


ORF 

(Dp) 


LO 
CO 

m 


CN 
CN 
CN 


CO 
CO 

-^r 


1755 


o 
to 
to 


cn 
m 


in 

Oi 


1278 


oo 

r- 

O) 


CN 
CN 

m 


CN 

a> 


2844 




CN 

CO 
CN 


1347 


CO 
CN 


1509 


to 
to 
cn 


oo 

CN 
CD 


1437 


Terminal 
(nt) 


2441005 


2441890 


2442792 


2441602 


2443356 


2444033 


2445709 


2446993 


2447998


2450323 


2450859 


2451794 


2455435 


2455452 


2455720 


2457337 


2459371 


2460336 


2461167 


2462599 


Initial 
(nt) 


2441589 


2441669 


2442355 


2443356


2444015 


in 
in 

CN 


2444735 


2445716 


2447021 


2450844 


2451785 


2454637 


2454725 


2455733 


2457066 


2457759 


2457863 


2459371 


2460340 


2461163 


SEQ 

NO. 
(a.a.) 


6029 


6030 


6031 


6032 


6033 


6034 


6035 


6036 


6037


6038 


6039 


6040


6041 


6042 


6043 


6044 


6045 


6046 


6047 


6048 


SEQ 

NO. 


2529 


2530


2531 


2532 


2533 


2534 


2535 


2536 


2537 


2538 


2539 


2540 


2541 


2542


2543 


2544 


2545 


2546 


2547 


2548 
Function


hypothetical protein 


hypothetical protein 


ribose kinase 


hypothetical membrane protein 




sodium-dependent transporter or
sodium bile acid symporter family


apospory-associated protein C 




thiamine biosynthesis protein x 


hypothetical protein 


glycine betaine transporter 








large integral C4-dicarboxylate
membrane transport protein


small integral C4-dicarboxylate
membrane transport protein


C4-dicarboxylate-binding
periplasmic protein precursor 


extensin 1 


GTP-binding protein




Matched 
length 
(aa) 


CO 

o 


LO 


o 
o 

CO 


CO 
CO 




00 
CN 


tn 

O) 
CN 




CO 
CO 


CD 


^- 

o 

CO 








CO 


CO 


CN 
CN 


CD 


CO 

o 

CO 




















































o 


co 




CO 


CN 




o 


tn 










cn 




C3 


o 


CO 








58. 


CO 


CO 




CO 


m 




o 
o 


ih 

CD 












co 


<?> 

m 


CO 


CO 
CO 




Identity 
(%) 


o 


CO 


o 


Oi 




CO 


m 




100.0 


CO 


CO 








CO 


Oi 


CN 


o 


N- 




iri 
n 


oi 

CN 




C3> 
CO 




CO 


od 

CN 




CN 


CO 








CO 


CO 
CO 


CO 
CN 


CO 
<D 


CO 

m 


Table 1 (continued)


Homologous gene 


Aeropyrum pernix K1 APE1580 


Aquifex aeolicus VF5 aq_768 


Rhizobium etli rbsK 


Streptomyces coelicolor A3(2)
SCM2.16C 




Homo sapiens 


Chlamydomonas reinhardtii 




Corynebacterium glutamicum 
ATCC 13032 thiX 


Mycobacteriophage D29 66 


Corynebacterium glutamicum 
ATCC 13032 betP 








Rhodobacter capsulatus dctM 


Klebsiella pneumoniae dctQ 


Rhodobacter capsulatus BIO 
dctP 


Lycopersicon esculentum 
(tomato)


Bacillus subtilis 168 lepA




db Match 


PIR:G72536 


pir:D70367


< 

o 

CO 

LO 
CN 

t: 

CL 


gp:SCM2_16




i 

X 

1 

o 
1- 

n. 


^* 
■a- 

CN 

in 
o> 

< 
CL 

cn 




_j 
C9 

O 

o 

X 

cL 
(/) 


Q 

Q_ 
CD 

«>' 

to 

cL 
</) 


-J 
o 
q: 
O 
o 

a' 

UJ 
CD 

CL 








prf:2320266C


^1 
o 

CO 
CO 

LL 
< 

O) 


5 

o 

X 

q: 

Q.' 

O 
Q 

Q- 
(/) 


PRF:1806416A 


sp:LEPA_BACSU 




si 


o 

to 


C7> 

m 


fO 

o 

CJ> 


1425 


cn 
a 

CO 


CN 

a> 


to 
-<r 
oo 


to 

CO 
CO 


o 
in 


oo 

oo 
to 


1890 


CO 

cn 

O) 


1608 


•V 

CO 
CO 


1311 


o 
oo 


N> 
NT 


CO 
CN 


1845 




Terminal 
(nt) 


2461543 


2462602 


CO 
■T 

to 

CN 


2465768 


m 

CD 

m 

CO 
CN 


2466038 


2467922


2470678 


2472819


2472893 


2475542 


2477492


2479251 


2479762 


2479898 


2481213 


2481734 


2484087 


2482548 




Initial 
(nt) 


2462049 


2463150 


2463241 


2464344 


2465767 


2467009 


2467077 


2470313 


2472250 


2473480 


CO 

m 

CO 
CO 

CN 


2476497 


2477644 


2479379 


2481208 


2481692 


2482480 


tn 

CO 
CO 
CO 

CN 


CN 
CD 

n 

'0- 
CO 

^ 

CN 




SEQ
NO.
(a.a.)


6049 


6050 


6051 


6052 


6053 


6054 


6055 


6056 


6057 


6058 


6059 


6060 


6061 


6062 


6063 


6064 


6065 


6066


6067 




SEQ 

NO. 
(DNA)


2549 


2550 


2551 


2552


2553 


2554 


2555 


2556 


2557 


2558 


2559 


2560 


2561 


2562 


2563 


2564 


2565 


2566 


2567 
Function


hypothetical protein 


30S ribosomal protein S20 


threonine efflux protein


ankyrin-like protein 


hypothetical protein 


late competence operon required for 
DNA binding and uptake 


late competence operon required for 
DNA binding and uptake 




hypothetical protein 


phosphoglycerate mutase 


hypothetical protein 


hypothetical protein 




gamma-glutamyl phosphate 
reductase or glutamate-5- 
semialdehyde dehydrogenase 


D-isomer specific 2-hydroxyacid 
dehydrogenase 




GTP-binding protein 




Matched 
length 
(aa) 


m 

00 


in 
oo 


o 


CD 
CN 


cn 
cn 


CN 

in 


to 

CO 




CO 
CN 


in 
cn 

CN 




o> 




CN 

cn 


o 

CO 




CO 










































Similarity
(%)


69.7 


72.9 


67.1 


80.6 


74.1 


49.7 


63.6 




66.3 


66.4 


86.3 


85.3 




99.8 


100.0




78.2 




Identity 
(%) 


41.6 


CO 


30.0 


61.2 


46.0 


21.4 


30.8 




34.8 


46.8 


55.6 


68.0 




99.1 


99.3 




58.9 


Table 1 (continued) 


Homologous gene 


Mycobacterium tuberculosis 
H37RV RV2405 


Escherichia coli K12 rpsT 


o 

CN 

"6 
o 

ra 

IE 
o 

0) 

xz 
o 
cn 

LU 


Streptomyces coelicolor A3(2) 
SC6D7.25. 


Mycobacterium tuberculosis 
H37RV Rv2413c 


Bacillus subtilis 168 comEC 


Bacillus subtilis 168 comEA




Streptomyces coelicolor A3(2) 
SCC123.07C. 


Mycobacterium tuberculosis 
H37RV RV2419C 


Mycobacterium tuberculosis 
H37Rv Rv2420c 


Streptomyces coelicolor A3(2) 
SCC123.17C. 




Corynebacterium glutamicum 
ATCC 17965 proA 


Corynebacterium glutamicum 
ATCC 17965 unkdh 




Streptomyces coelicolor A3(2) 
obg 




db Match 


pir:H70683 


-J 
O 

o 

LU 

o' 

CN 
CO 

(£. 

iCL 
i/> 


sp:RHTC_ECOLI


m 
Q 

CO 

O 
CO 

d. 
o> 


pir:H70684


sp:CME3_BACSU 


3 
CO 

CD 

t 

It— 
LU 

O 
cL 

t/i 




CN 
O 

o 
CO 
cL 
cn 


in 
oo 

(O 

o 

UL 
"CL 


in 

CO 

<o 
o 

9. 
"q. 


gp:SCC123_17




_j 

CD 

O 
o 

<' 
o 
q: 

Q- 

o. 
<n 


sp:YPRA_CORGL 




CJ) 

oo 
Q 

CL 
O) 




u 


cn 
o 

CO 


r— 
CO 
CM 


CD 
CO 


m 
o 


in 


1539 


CN 
CO 

in 


CN 
CN 
OO 


CN 
CN 
00 


CO 

o 


^ 


00 
CO 


1023 


1296


CN 
O) 




1503 




Terminal 
(nt) 


2485269 


2485733 


2485801 


2486477 


2486910 


2487912 


2489573 


2491732 


2490290 


2491151 


2491873 


2492501 


2493215 


2494339 


2495696 


2497513 


2498009 




Initial 
(nt) 


2484661 


2485473 


2486469 


2486881 


2487884 


2489450 


2490154 


2490911 


2491111 


2491858 


2492343 


2493178


2494237 


2495634 


2496607 


2496803 


2499511 




SEQ
NO.
(a.a.)


6068 


6069 


6070 


6071 


6072 


6073 


6074 


6075 


6076 


6077 


6078 


6079 


6080 


CO 

o 

CD 


6082 


6083 


6084 


55 


SEQ
NO. 
(DNA) 


2568 


2569 


2570 


2571 


2572 


2573 


2574 


2575 


2576 


2577 


2578 


2579 


2580 


2581 


2582 


2583 


2584 
Function


xanthine permease 


2,5-diketo-D-gluconic acid reductase 






50S ribosomal protein L27 


50S ribosomal protein L21


ribonuclease E 








hypothetical protein 


transposase (insertion sequence 
IS31831) 


hypothetical protein 


hypothetical protein 


nucleoside diphosphate kinase 




hypothetical protein 


hypothetical protein 


hypothetical protein 




15 


Matched 
length 
(a.a) 


fM 
CM 


CD 

r*- 

CM 






oo 




1 886 








in 


CD 
CO 




CO 


CO 




CN 
(3) 


CN 


CO 




20 


Similarity 
(%) 


CO 


81.9 






92.6 


82.2 


56.6 








82.6 


100.0 


76.9 


CD 


89.6 




67.4 


64.3 


68.6 






Identity 
(%) 


39.1 


61.2 






80.3 


56.4 


30.1 








61.0 


99.1 


51.3 


37.8 


70.9 




GO 
CO 


36.6 


33.9 




25 

o 
o 

30 ^ 
a> 

S3 


Homologous gene 


Bacillus subtilis 168 pbuX 


Corynebacterium sp. ATCC 
31090 






Streptomyces griseus IFO13189
rpmA 


Streptomyces griseus IFO13189
obg 


Escherichia coli K12 rne








Streptomyces coelicolor A3(2)
SCF76.08C 


Corynebacterium glutamicum 
ATCC 31831 


Streptomyces coelicolor A3(2) 
SCF76.08C 


Streptomyces coelicolor A3(2) 
SCF76.09 


Mycobacterium smegmatis ndk 




Deinococcus radiodurans R1
DR1844 


Mycobacterium tuberculosis 
H37Rv Rv1883c


Mycobacterium tuberculosis 
H37Rv Rv2446c




35 












































40 


db Match 


sp:PBUX_BACSU


pir:I40838






tr 

O 

}-~ 
in 

r^' 

CM 
—1 

QL 

a. 
cn 


prf:2304263A 


—» 
o 
o 

LU 
Hi' 

q: 

CL 
(/) 








CO 

u. 
O 

in 

CL 
Ol 


pir:S43613 


gp:SCF76_8 


cn 

1 

CD 

u. 
O 
CO 

id 

O) 


O) 
CO 

o 

o. 
cn 




o 

<N 
O 
CN 
O 

o 

C7> 


pir:H70516 


pir:E70863






ORF 
(bp) 


1887 


CO 

oo 


CM 
(O 


CD 

cn 

CO 


rr 

CD 
CM 


CO 

o 

CO 


2268 


cn 
in 


CO 

r- 
in 




O) 
O 
CD 


1308 


GO 
CO 


o 
in 


GO 

o 


o 

CO 
CO 


CN 
CO 


in 

CD 


CO 
CN 
-T 




45 


Terminal 
(nt) 


2501669


2501735 


2503355 


2504265 


2503984 


2504300 


2504831 


2507663 


2507710 


2508840


2509530


2509523 


2511423 


2511876 


2511949 


2512409 


2513144 


2513154 


2513692 




50 


Initial 
(nt) 


2499783 


2502577 


2502735 


2503870 


2504247 


2504602 


2507098 


2507115 


2507138 


2508094 


2508922 


2510830 


2511046 


2511427 


2512356 


2512768 


2512803 


2513616 


2514114 


1 




CO 2: ^ 


6085


6086


oo 
a 

(O 


6088


6089 


6090 


6091 


6092 


6093 


6094 


6095 


6096 


6097 


6098 


6099 


6100 


6101 


6102 


6103 


55 


o fS i 

j C/3 Z C 


" tn 

O} 
IT 
I CN 


2586 


OC 
ID 
CN 


CO 

CC 

ir 

c\ 


2589 


2590 


C31 

in 

c\ 


c\ 
cr 
\n 

CN 


CO 

• cn 
in 

CM 


XT 

Ol 
Ul 


2595 


2596 


2597 


2598 


2599 


o 
o 

CO 
CM 


2601 


2602 


2603 
Function


folyl-polyglutamate synthetase 








valyl-tRNA synthetase 


oligopeptide ABC transport system 
substrate-binding protein 


heat shock protein dnaK 


lysine decarboxylase


malate dehydrogenase 


transcriptional regulator 


hypothetical protein 


vanillate demethylase (oxygenase) 


pentachlorophenol 4- 
monooxygenase reductase 


transport protein 


malonate transporter 


class-III heat-shock protein or ATP-dependent protease


hypothetical protein 


succinyl CoA:3-oxoadipate CoA
transferase beta subunit 


succinyl CoA:3-oxoadipate CoA
transferase alpha subunit 


15 


Matched 
length 
(aa) 










CO 
O) 


CM 
UO 


CO 

o 
in 


o 
r-- 


CT> 
CD 


o 
CM 


CO 

o 

CM 


in 

CO 


CO 
CO 
CO 




CO 
CO 
CM 


o 

CO 


CD 
CD 
CO 


o 

CM 


tn 

CN 


20 


Similarity 
(%) 


79.6 








72.1 


58.6 


54.9 


71.2 


76.5 


56.5 


51.4 


68.6 


59.2 


76.8 


58.4 


85.8 


73.0 


85.7 


84.5 




Identity 

(%) 


55.4 








45.5 


24.2 


26.2 


42.9 


56.4 


24.6


26.0 


39.5 


32.8 


40.6 


28.0 


59.8 


45.6


63.3 


60.2 


Table 1 (continued) 


Homologous gene 


Streptomyces coelicolor A3(2) 
folC 








Bacillus subtilis 168 valS


Bacillus subtilis 168 oppA


Bacillus subtilis 168 dnaK


Eikenella corrodens ATCC 
23824 


Thermus aquaticus ATCC 33923 
mdh


Streptomyces coelicolor A3(2) 
SC4A10.33 


Vibrio cholerae aphA 


Acinetobacter sp. vanA 


Sphingomonas flava ATCC 
39723 pcpD 


Acinetobacter sp. vanK 


Klebsiella pneumoniae mdcF 


% 

o 
w 

=3 
</) 

O 

m 


Streptomyces coelicolor A3(2) 
SCF55.28C 


Streptomyces sp. 2065 pcaJ 


Streptomyces sp. 2065 pcaI


35 
40 


db Match 


prf:2410252B 




: 




sp:SYV_BACSU 


CO 
CO 

< 


sp:DNAK_BACSU 


gp:ECU89166_1 


sp:MDH_THEFL 


CO 

< 
o 

CO 
CL 


gp:AF065442_1 


prf:2513416F


gp:FSU12290_2


O 

CO 

m 

CM 

t* 


gp:KPU95087_7 


< 

CN 
CO 

o 

CO 
CN 

t: 
a. 


gp:SCF55_28 


gp:AF109386_2 


^' 

CO 
CO 

cn 
o 

< 
cL 

O) 




Li_ ^ 


1374 


CSI 

5 


r— 


CO 

to 

CO 


2700 


1575 


1452 


in 

CO 

m 


CO 

cn 




CD 

m 


1128 


tn 
cn 


1425 


o 

CO 
CD 


1278 


1086 


CO 

cn 

CD 


o 
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toluate 1,2 dioxygenase subunit 
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1 ,2-dihydroxycyclohexa-3,5-diene 
Table 1 (continued)
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galactose-6-phosphate isomerase 


hypothetical protein 
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aminopeptidase N 


hypothetical protein 








phytoene desaturase 
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acetylornithine aminotransferase 


hypothetical protein 


hypothetical membrane protein 


acetoacetyl CoA reductase 


transcriptional regulator, TetR family
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in 
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(N 
CD 
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CN 



CN 
CN 



< 

^' 

CO 

—I 
C3 



CN 

to 



CO 
CN 
CO 
CN 



OO 
CN 
CO 



CN 

in 

00 
CO 



CO 

CO 
CO 
CN 
CO 
CN 



CN 
CM 
CO 



CN 
CN 



in 

CN 
CN 
CD 



CO 
CN 
CM 
CO 



in 

CN 



CD 
CM 



LU 



c 
o 

15 



CD 

CN 



in 



o 

CM 



Q. 
O 



M Si 



U 

CM 



a. 



CD 
CO 

CN 
CO 
CO 
CN 



O 

CN 

CO 
CO 
CM 



a: 

c 

'oi 

2 
o. 

2r 
o 

? 

3 
O) 
0) 



Is 

5 Si 

^ > 
o QC 

>>ro 



< 

CN 
CO 
CN 



O 
CO 

o 



o 

CO 
CN 
CO 



O 

CO 

CM 



CO 
CO 
CM 



Ul 



CO 
< 

o_ 
o 
.o 

■«) 
o 

15 

o — 
CJ 
CO C/) 



O 
CO 
a. 



CO 
CO 
CM 



CO 
CD 
CN 



CM 
CO 
CM 
CO 



CO 
CO 
CN 
CO 



CM 
CO 



CM 
CO 
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5 
10 


Function 


phosphopantetheine protein
transferase 


lincomycin resistance protein 


hypothetical membrane protein 




fatty-acid synthase 


hypothetical protein 


peptidase 


hypothetical membrane protein 


hypothetical membrane protein 


hypothetical protein 


ribonuclease PH 








hypothetical membrane protein 


CO 
CN 
CO 

CO 

o 

CL 
cn 
c 

CO 




arylsulfatase 


15 


Matched 
length 
(aa) 




CO 


CO 

■r— 




3029 


O 


o 

CO 
CN 


CN 


CO 


CN 
O 
CN 


CO 
CO 
CN 








00 
CN 


tn 




o 
in 

CN 










































20 


Similarity
(%)


75.9 


85.6 


54.0 




83.6 


55.2 


60.9 


67.9 


69.0 


76.7 


81.4 








58.2


97.2 




74.4 




Identity 
(%) 


56.6 


52.4 


30.1 




62.3 


25.3 


40.4 


40.2 


37.2 


55.0 


60.2 








29.0 


92.1 




o 


25 

o 

C 

c 

30 yr- 

a> 

35 
40 


Homologous gene


Corynebacterium 
ammoniagenes ATCC 6871 ppt1


Corynebacterium glutamicum
lmrB


Synechocystis sp. PCC6803




Corynebacterium 
ammoniagenes fas 


Streptomyces coelicolor A3(2) 
SC4A7.14 


Mycobacterium tuberculosis 
H37Rv Rv0950c


Mycobacterium tuberculosis 
H37Rv Rv1343c


Mycobacterium leprae 
B1549_F2_59 


Mycobacterium tuberculosis 
H37Rv Rv1341


Pseudomonas aeruginosa
ATCC 15692 rph 








Mycobacterium tuberculosis 
H37Rv SC8A6.09c


Corynebacterium glutamicum 
22243 R-plasmid pAG1 tnpB




Mycobacterium leprae ats 


db Match 

1 


OO 

o 

i 

d 


CO 

CO 
(M 
LL 
< 

id. 
cn 


pir:S76537




pir:S2047 


< 

O 
CO 

Ol 

cn 


pir:D70716 


h- 
o 
>- 

o 
> 

CA 


sp:Y076_MYCLE 


? 
o 
>- 

o 

CO 

o 
> 

(A 


LU 
< 
UJ 
CO 
0- 

X 

a. 

Ql 

id. 
CO 








h- 
o 
> 

CN 
O 
> 

CL 
V> 


gp:AF121000_8 




sp:Y03O_MYCLE 




ORF 


in 
o 


1425 


CN 
CO 




8979 


1182 


in 
5 


CN 
CO 


S 

CO 


00 
CO 


tn 

CO 


CD 
CN 


CO 
Oi 
CO 


CN 
CO 

m 


1362 


CO 

in 


o 

CO 
CO 


in 

CD 


45 


Terminal 
(nt) 


2634747 


2635165


2637168 


2637240 


2638649 


2648235 

1 


2650164 


2650902 


2651339 


2651420 


2652067 


2653009 


2653326 


2654079 


2654875 


2656985 


2656974 


2657736 


50 


Initial
(nt) 


2635151 


2636589 


in 

CO 
CO 
CO 

to 

(N 


2637653 


2647627


2649416 


2649550 


2650441 


2650986 


2652037 


2652801 


2653254


2654018 


2654660


2656236 


2656452 


2657633 


2658500 




CO 2 


6235 


6236 


6237 


6238 


6239 


6240 


rr 
• CN 

CO 


6242 


6243 


6244 


6245 


6246 


6247 


6248 


6249 


6250 


6251 


6252 


55 


SEQ 

NO. 
(DNA) 


2735 


2736 


2737 


2738 


2739 


2740 


2741 


2742 


2743 


2744 


2745 


2746 


2747 


CO 

;^ 

CN 


2749


2750 


2751 


CM 

in 

CN 
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=] 

c 



o 

u 



Function 


D-glutamate racemase 




bacterial regulatory protein, marR
family


hypothetical membrane protein 




endo-type 6-aminohexanoate 
oligomer hydrolase 


hypothetical protein 


hypothetical protein 




hypothetical protein 




ATP-dependent helicase 


hypothetical membrane protein 


hypothetical protein 


phosphoserine phosphatase 




cytochrome c oxidase chain 1 




Matched 
length 
(aa) 


OO 
CM 




T— 


in 

CM 
CM 




CM 
CO 


o 
o 

CN 


in 
o 




CO 
CM 




CD 


CO 
CO 


CM 
CM 
CM 


o 

cn 




m 
in 




>* 






































i2 ^ 


CO 

cri 
cn 




70.8 


69.3 




58.3 


58.5 


77.1 




80.8 




53.3 


60.1 


52.0 


61.0 




74.4 




w 






































Identity 
(%) 


99.3 




CM 


38.2 




30.2 


35.0 


57.1 




CM 




25.2


29.7 


39.0 


38.7 




46.8 




Homologous gene 


Corynebacterium glutamicum 
ATCC 13869 murl 




Streptomyces coelicolor A3(2) 
SCE22.22 


Mycobacterium tuberculosis 
H37Rv Rv1337




Flavobacterium sp. nylC 


Mycobacterium tuberculosis 
H37Rv Rv1332


Mycobacterium tuberculosis 
H37Rv Rv1331




Mycobacterium tuberculosis 
H37Rv Rv1330c




Escherichia coli dinG 


Mycobacterium tuberculosis 
H37Rv Rv2560


Streptomyces coelicolor A3(2)
SC1B5.06c


Escherichia coli K12 serB 




Mycobacterium tuberculosis 
H37Rv Rv3043c




db Match 


prf:2516259A 




gp:SCE22_22 


sp:Y03M_MYCTU




cn 

CO 

o 

LJ 
CL 


sp:Y03H_MYCTU


sp:Y03G_MYCTU 




O 
> 

1 

LL 
CO 

> 

■ id 
tn 




prf:1816252A 


sp:Y0A8_MYCTU 


CO 
CO 

CO 


_i 
O 

o 

LU 

m' 

iX. 

LU 

u> 

id. 
w 




pir:D45335 






(M 

in 

00 


CD 
CO 
CO 






! ^ 
■ oy 

CO 


o 

CO 

cr> 


r- 

CO 

in 


o 
o 

CO 


CN 
CO 


1338 


CO 

o 

CO 


1740 


<n 

CO 


CO 
CN 


1017 


1596 


1743 


CO 

o 

CO 


Terminal 
(nt) 


2658606 


2660131 


2660147 


2660671 


2662455 


2661417 


2662331 


2662883 


2664060 


2665397 


2665992 


2667854


2667870 


2668839 


2669557 


2672721 


2671063 


2673255 


Initial 
(nt) 


2659457 


2659496 


2660638 


2661417 


2661565 


2662376 


2662867 


2663182 


2663437 


2664060 


2665687 


2666115


2668760 


2669561 


2670573 


2671126 


2672805 


2672950 


U5 2 ^ 


6253 


6254 


6255 


6256 


6257 


6258 


6259 


6260 


6261 


6262 


6263 


6264 


6265 


6266 


6267 


6268 


6269 


o 

CN 
CO 


w ^ 9 


2753 


2754 


2755 


2756 


2757 


2758 


2759 


2760 


2761 


2762 


2763 


2764 


2765 


2766 


2767 


2768 


2769 


o 

CN 
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3 



Function 


ribonucleotide reductase beta-chain 


ferritin 


sporulation transcription factor 


iron dependent repressor or 
diphtheria toxin repressor


cold shock protein TIR2 precursor 


hypothetical membrane protein 


ribonucleotide reductase alpha- 
chain 




50S ribosomal protein L36


NH3-dependent NAD(+) synthetase 






hypothetical protein 


hypothetical protein 


alcohol dehydrogenase 


Bacillus subtilis mmg (for mother cell 
metabolic genes) 


hypothetical protein 




phosphoglucomutase 


Matched 
length 
(aa) 


CO 
CO 


Oi 

in 


CO 

in 
OJ 


in 

OJ 
OJ 


CN 
^— 


o 
in 


o 






cn 
r- 
r>4 






in 

CM 


CD 
O) 


1^ 

CO 
CO 


O) 

m 


OO 
CN 




CO 

in 
m 










































Similarity
(%) 


99.7 

i 


64.2 


60.2 


60.4 


62.1 


86.0 


100.0 




79.0 


78.1 






56.4 


68.8 


CO 
OJ 

in 


56.0 


66.2 




80.6 


Identity 
(%) 


99.7 


31.5 


32.8 


27.6 


24.2 


50.0 


99.9 




58.0 


55.6 






30.7 


41.7 


26.1 


27.0 


33.8 




61.7 


Homologous gene


Corynebacterium glutamicum 
ATCC 13032 nrdF 


Escherichia coli K12 ftnA 


Streptomyces coelicolor A3(2) 
WhiH 


Corynebacterium glutamicum 
ATCC 13869 dtxR 


Saccharomyces cerevisiae 
YPH14BYOR010C TIR2 


Archaeoglobus fulgidus AF0251 


Corynebacterium glutamicum 
ATCC 13032 nrdE 




Rickettsia prowazekii 


Bacillus subtilis 168 nadE 






Synechocystis sp. PCC6803
sin 563 


Mycobacterium tuberculosis 
H37Rv Rv3129


Bacillus stearothermophilus 
DSM 2334 adh 


Bacillus subtilis 168 mmgE 


Arabidopsis thaliana T6K22.50




Escherichia coli K12 pgm 


db Match 


gp:AF112536_1 


_j 
O 
o 

ID 

<' 

Z 
1- 
u. 

CL 
(/> 


1 

T 
X 

CO 

< 
O 
U) 

d. 

cn 


pir:I40339


< 
UJ 

>-J 

OJ 
01 

\- 
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m 


00 
CN 
Oi 
CO 

O 

l: 

Ol. 


CO 
CO 

in 

CVJ 

IJ_ 
< 

Oi 




q: 

Q- 

o 

1 

CD 
CO 
_l 

q: 

CO 


sp:NADE_BACSU 






pir:S76790 


pir:G70922 


sp:ADH2_BACST 


Z) 

CO 

o 
< 
m 

LU 

O 

Q. 
tn 


in 
o 
(- 

L.' 
O. 




_j 

O 
o 

UJ 

1 

Q- 

d. 
^n 


ORF 
(bp) 


1002 


CO 
OO 

■n- 


o 

in 


o 

CO 
CO 


CO 

CO 

TT 


CO 
CN 


2121 

1 


m 

CO 




T — 

CO 
OO 


CO 

cn 


CO 
O 




OO 
CO 
OJ 


1020 


1371 


CO 
CO 


CN 

cn 
r- 


1662 


Terminal 
(nt) 


2673338 


2675289 


2676240 


2676243 


2677377 


2676918 


2677478 


2680784 


2681223


2682376 


2681464 


2683616 


2682379 


2683131 


2683627 


2686289 


2687148 


2687449 


2688389 


Initial 
(nt) 


2674339 


2674804 


2675491 


2676902 


2676940 


2677193 


2679598 


2680470 


2681363 


2681546 


2681556 


2683119 


2683125 


2683418 


2684646 


2684919 


2686315 


2688240 


2690050 


to 2 5, 


6271 


6272 


6273 


6274 


6275 


6276 


6277 


6278 


6279 


6280 


6281 


6282 


6283 


6284 


6285 


6286 


6287 


CO 
CO 
CN 
CO 


6289 


UJ O 2 

S :2 e 


2771 


2772 


2773 


CM 


2775 


2776 


2777 


2778 


2779 


2780 


2781 


2782 


2783 


2784 


2785 


2786 


2787


2788


2789 
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5 
10 


Function 


hypothetical membrane protein 


hypothetical membrane protein 


hypothetical protein


transposase (IS1676)


major secreted protein PS1 protein 
precursor 








transposase (IS1676)




proton/sodium-glutamate symport 
protein 




ABC transporter 




ABC transporter ATP-binding protein 


hypothetical protein 


hypothetical protein 




oxidoreductase or dehydrogenase


15 


Matched 
length 
(aa) 


OD 


CN 
CN 


LO 
CN 


CD 


lO 

in 

CO 








o 
o 
m 




00 
CO 




CO 
CO 




00 

eg 


CO 


CN 




CO 




>> 








































20 


Similarity
(%)


64.3 


61.5 


79.1 


48.6 


49.6 








46.6 




66.2 




69.0 




79.8 


67.0 


75.0 




54.1 




Identity 

(%) 


41.7 


25.4 


51.2 


24.2 


CO 
CN 








24.6 




30.8 




33.0 




LO 


60.0 


71.0




28.1 


25 

c 
o 

30 1- 

35 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv3069


Helicobacter pylori J99 jhp1146 


Bacillus subtilis 168 ycsI


Rhodococcus erythropolis 


Corynebacterium glutamicum
(Brevibacterium flavum) ATCC 
17965 csp1 








Rhodococcus erythropolis 




Bacillus subtilis 168 




Streptomyces coelicolor A3(2) 
SCE25.30 




Staphylococcus aureus


Chlamydophila pneumoniae 
AR39 CP0987 


Chlamydia muridarum Nigg 
TC0129 




Streptomyces collinus Tu 1892 
ansG 


40 


db Match 


pir:F70650 


CO 
00 

Q 

Q- 


sp:YCSI_BACSU 


gp:AF126281_1 


sp:CSP1_CORGL








gp:AF126281_1




sp:GLTT_BACCA 




gp:SCE25_30 




CN 

CO 
CO 

< 
w 

CD 


PIR:F81516 


PIR:F81737 




prf:2509388L 




LL 


CD 
CD 
CN 


CN 
CO 


CN 

cn 


1365 


1620 


LO 
rO 


in 

CD 




1401 


CO 
CO 


1338 


CO 
Oi 
CO 


2541 


CD 
00 


OD 
O 


CO 
CN 




CO 
CD 


CN 
CO 


45 


Terminal 
(nt) 


2690437 


2690760 


2691564 


2693053 


2694918 


2695279 


2695718 


2695320 


2697212 


2697383 


2698194 


2701612 


2699926 


2703356 


2702487 


2704586 


2704975 


2710555 


2711308 


50 


Initial 
(nt) 


2690150 


2690437 


2690773 


2691689 


2693299 


2694926 


2695554 


2695766 


2695812 


2698150 


2699531 


2700920


2702466 


2702466 


2703194 


2704314 


2704835 


2709878 


2710637 




^ 41 


6290 


o 

CN 
CD 


6292 


6293 


CD 
CN 
CO 


6295 


6296 


6297 


6298 


6299 


6300 


6301 


6302 


6303 


o 

fO 
(D 


6305 


6306 


6307 


6308 


55 


SEQ 

NO. 
(DNA)

2790 


2791 


I CN 

i cn 
1 r-- 

|CN 


2793 


2794


2795 


2796 


2797 


2798 


2799 


2800 

1 


2801 


2802 


CO 

o 

oo 
CN 


2804 


2805 


2806 


2807 


2808 
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5 
10 


Function 


methyltransferase 


hypothetical protein 


hypothetical protein 




UDP-N-acetylglucosamine 1- 
carboxyvinyltransferase 


hypothetical protein 


transcriptional regulator 




cysteine synthase 


O-acetylserine synthase 


hypothetical protein 


succinyl-CoA synthetase alpha 
chain 


hypothetical protein 


succinyl-CoA synthetase beta chain 




frenolicin gene E product 




succinyl-CoA coenzyme A 
transferase 


transcriptional regulator 


15 


Matched 
length 
(aa) 


m 
o 

CM 




CM 






o 


CO 
CM 




in 
o 
cn 


CM 

r- 


CO 
CO 


a> 

CM 


m 


o 
o 




CO 
CM 




o 
tn 


CN 
CO 


20 


Similarity 

(%)


51.2 


66.0 


75.0 




75.3 


84.2 


69.0 




84.6 


79.7 


65.1 


79.4 


43.0 


73.0 




71.8 




77.8 


68.5 




Identity 
(%) 


25.9 


61.0 


71.0 




44.8 


66.3 


45.9 




57.1 


61.1 


36.1 


52.9 


42.0 


39.8 




38.5 




47.9 


38.6 


25 

o 

30 X 

35 

40 


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0089


Chlamydia pneumoniae 


Chlamydia muridarum Nigg
TC0129 




Acinetobacter calcoaceticus 
NCIB 8250 murA


Mycobacterium tuberculosis 
H37Rv Rv1314c


Streptomyces coelicolor A3(2)
SC2G5.15C 




Bacillus subtilis 168 cysK


Azotobacter vinelandii cysE2 


Deinococcus radiodurans R1 
DR1844 


Coxiella burnetii Nine Mile Ph 1 
sucD 


Aeropyrum pernix K1 APE 1069 


Bacillus subtilis 168 sucC 




Streptomyces roseofulvus frnE 




Clostridium kluyveri cati cat1 


o 
o 

!< 

a 
tn 
c 
CJ 

m 

CO 

|y 

ex in 
«n ^ 
O 

NO 
< CM 


db Match 


1- 
O 

OJ 

o 
> 


GSP:Y35814 


PIR:F81737 




< 

a 

o 

cL 


i- 

O 
>- 

CM 
O 
> 

iCL 
tn 


gp:SC2G5_15 




sp:CYSK_BACSU


prf:2417357C


— 

gp:AE002024_10 


3 
CD 
X 

O 

^. 
o 
o 

3 
CO 

'd. 
cn 


PIR:F72706 


3 
CO 

O 
< 
m 

o' 

O 
3 

to 

a. 
tn 




gp:AF058302_5




sp:CAT1_CLOKL


sp:NIR3_AZOBR






LO 
CM 
lO 


CO 
CM 




in 
cn 


1254 


O 

in 


CO 
CO 


00 

o 


Tj- 

CM 

cn 


CO 

in 


CD 

CO 
CN 


CM 
CO 

GO 


in 

CM 
CM 


1194 


o 

CO 
CO 


in 

CO 

r-. 


<T> 
CO 


1539 


1143 


45 


Terminal 
(nt) 


2712374 


2713453 


2713842 


2717993 


2718436 


2720319 


2720385 


2721295 


2722857 


2723609 


2723770 


2724478 


2725643 


2725384 


2726786 


2727399 


2728207 


2729378 


2732518 


50 


Initial 
(nt) 


2711850 


2713181 


2713702 


2718187 


2719689 


2719750 


2721227 


2721702 


2721934 


2723064 


2724057 


2725359 


2725619 


2726577 


2727145 


2728133 


2729025 


2730916 


2731376 






6309 


6310 


6311 


6312 


6313 


6314 


6315 


6316 


6317 


6318 


6319 


6320 


6321


6322 


CO 
CM 
CO 

: ^ 


6324 


6325 


6326 


6327 


55 


SEQ 
NO. 
(DNA) 

2809 


o 

CO 
CN 


2811 


2812 


2813 


2814 


2815 


2816 


GC 
CN 


CO 

GC 
C\ 


2819 


2820 


U-ICN 

CM ' CM 
i to : CO 
i CN : CN 


CO 
CM 
CO 
CM 


CN 
CO 
CN 


in 

CM 
00 


2826 


2827 
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5 
10 


Function 




phosphate transport system 
regulatory protein 


phosphate-specific transport 
component 


phosphate ABC transport system 
permease protein 


phosphate ABC transport system 
permease protein 


phosphate-binding protein S-3 
precursor 


acetyltransferase 




hypothetical protein 


hypothetical protein 


branched-chain amino acid
aminotransferase 


hypothetical protein 


hypothetical protein 


5'-phosphoribosyl-5-aminoimidazole
synthetase 


amidophosphoribosyl transferase 


15 


Matched 
length 
(aa) 




CO 
CM 


tn 
tn 

CM 


CM 

Oi 

CM 


in 

CM 
CO 


O) 
CD 
CO 


in 

CO 




ro 


in 

CM 
CM 


o> 
in 
OJ 


CM 

in 

CO 


00 

in 


ro 


CO 


20 


Similarity 
(%) 




81.7 


82.8 


82.2 


to 
cc> 


56.0 


60.0 




55.2 


74.2 


0 
to 
in 


79.0 


81.0 


94.2 


89.0 




Identity 
(%) 




46.5 


58.8 


51.4 


50.2 


40.0 


34.3 




24.7 


Oi 


28.6 


58.5 


to 

CO 

in 


81.0 


70.3 


25 

-a 

(U 
Z3 
C 

o 
o 

30 ^ 

to 
n 

35 


Homologous gene 




Mycobacterium tuberculosis 
H37Rv Rv0821c phoY-2


Pseudomonas aeruginosa pstB 


Mycobacterium tuberculosis 
H37Rv Rv0830 pstA1


Mycobacterium tuberculosis 
H37Rv Rv0829 pstC2 


Mycobacterium tuberculosis 
H37Rv phoS2


Streptomyces coelicolor A3(2)
SCD84.18C 




Bacillus subtilis 168 bmrU 


Mycobacterium tuberculosis 
H37Rv Rv0813c


Solanum tuberosum BCAT2 


Corynebacterium 
ammoniagenes ATCC 6872 
ORF4


Mycobacterium tuberculosis 
H37Rv Rv0810c 


Corynebacterium 
ammoniagenes ATCC 6872 
purM 


Corynebacterium 
ammoniagenes ATCC 6872 
purF 
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CO 
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CM 
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CM 


1074 


1482 


45 


Terminal 
(nt) 


2731424 


2733367 


2733455 


2734264 


2735202 


2736414 


2737836 


2739553 


2739556 


2741356 


2741636 


2743785 


2744222 


2744881 


2746083 
Initial
(nt) 


2732230 


2732636 


2734351 


2735184 


2736215 


2737538 


2738711 


2738771 


2740650 
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to 

OJ 
M- 
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2745954 


2747564 




w 2: e 
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6332 
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55 
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NO. 
(DNA) 
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0 
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OJ 
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CM 
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tein 




ase 




tase 


















Function 


hypothetical protein 


hypothetical protein 


hypothetical membrane protein


hypothetical protein 


5'-phosphoribosyl-N-formylglycinamidine synthetase




5'-phosphoribosyl-N-formylglycinamidine synthetase


hypothetical protein 




glutathione peroxidase


extracellular nuclease 




hypothetical protein 


C4-dicarboxylate transporter


dipeptidyl aminopeptidase


































Matched
length
(a.a.)


CM 


in 
t— 

CO 




CN 


CO 
CD 
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CO 
CN 
CN 
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CO 

m 


m 

CD 
Oi 




CN 




Oi 
CO 




































75.8 


p 


87.1


71.0 


89.5 




93.3 


93.7 




77.9 


51.5 




68.7 


81.6 


70.6 


"co 
































Identity 
(%)


CO 

in 


75.9 


67.7 


64.0 


77.6 




CO 

0 

CO 


81.0 




46.2 


28.0 




37.4 


0 

ay 


41.8 


Homologous gene


Mycobacterium tuberculosis 
H37Rv Rv0807


Corynebacterium
ammoniagenes ATCC 6872 
ORF2


Corynebacterium 
ammoniagenes ATCC 6872 
ORF1


Sulfolobus solfataricus 


Corynebacterium 
ammoniagenes ATCC 6872 
purL 




Corynebacterium 
ammoniagenes ATCC 6872 
purQ 


Corynebacterium 
ammoniagenes ATCC 6872 
purorf 




Lactococcus lactis gpo 


Aeromonas hydrophila JMP636 
nucH 




Mycobacterium tuberculosis 
H37Rv Rv0784


Salmonella typhimurium LT2 
dctA 


Pseudomonas sp. WO24 dapb1
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pir:C70709 


sp:DCTA_SALTY


prf:2408266A 


u 


in 

CO 


1017 




to 

CO 


2286 


0 

CM 


<j> 
to 

CD 


CO 
CN 


CN 
CN 

in 




2746 


to 

CN 
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CD 
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2753328 
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2757129 


2757863 


2759532 


Initial 
(nt) 


2748057 


2748095 


2749902 


2751918 


2752312 


2752402 


2752995 


2753237 


2753298


2753804 


2753992 


2756851 


2757815 


2759200 


2761649 
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6346 


6347 
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6349 
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6354 
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CO 
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2857 






10 



15 



o 



J C to 



20 



5£ 

■o 



25 



30 



3 



."5 



a> 
c 
a> 

O) 

to 

3 
O 



E 
o 
X 



35 



40 



X3 

■a 



45 



50 



So d 

(O ^ 



o 
c 

6 

IS iS 

o c 

<A O 

O c 



O 
N 

Q. O ^ 
3 C 



O) 
CM 



U 
3 
CO 

o 



GO 



CM 

00 
CO 

o 

O 

-— o 

(u c: 

? § 
o £ 3 



O 

o 

CD 
< 



in 



CO 
CO 

O 
O 

e5 

O) 

O) c 
c o 
^ E CD 

O E 3 
O (0 Q. 



CO 

o 
o 

CD 

< 



«/) 
c 

CO 

o 
c 

E 



in 

Oi 
CO 



CM 
CO 



oo 

CM ■ 



=J o> 

t/3 



0) 




•o 




E 




nj 




c 








;^ 












VI 




o 




X) 




•c 




o 


0) 


JC 


to 




TO 


hos 


thet 


a. 


c 




>^ 




(O 



CM 



CM 
CO 

to 

a 
o 

Z> CO 

a, 
O) c 

s g 

er^EQ 

o £ = 

CJ CO Q. 



o 

if) 
_i 



5 



CD 
CM 



55 



O ^ < 
uj O z 

S 2 Q 



CO 
(M 
CO 



CO 
CM 



O 

in 
ro 



CO 



I CD 
! ^ 



t 



to 



QO 



I O 
CD 

! 00 



OO 

in 



CD 

r- 

CM 



CO 
CO 



CD 
CO 
CO 



CO 

o 
o 

CD 

< 



CO 

to 

CM 



CO 
CO 

CM 



o 

CM 



CM 
CD 
CO 
CO 



CO 
CO 

, CN 



CM 
CD 
CO 
CM 



Q. 
CL 



CO 



o 

6 2 

c <a 
.= >«- 

i ^ 

) o 

i| 

O ™ 

£ S 

J i 

trt c 

o o 

c c 

a> o 

"O X 

R3 O 



(0 
O 

'c 
o 

O) a, 

Qi 2 
cx o 

.is 

E S 

? .£ 
« E 



CO 
CM 



—I T3 



o 



OO 
GO 



in 



E CM 
3 -3 

^} 

|i 

11 

3 

ca o 

X3 TO 

a> < 
o ±; o 
O CD - 



CD 

tr 

O 
O 

<' 
O 
CD 



CO 

in 



to 



in 
o 



CO 
CD 



CO 
CM 



o 

g 

o5 

T3 



CM 



CD 

cii 



CO 

£ C4 
=3 -) 

|i 

CO 



CO "o 
^ TO 
O ^ 

£^ o D 



O 
O 

^. 
Q 

g 



CO 



o 

CO 
CO 
CM 



o 

CM 



CO 
CO 
CO 



CO 
CO 
CO 
CM 



CO 
CO 
CM 



CJ> 
CO 



OO 
CD 



E g 

o .5 



m 

CO 
CO 



m 
d 



CO 
CO 



o to 
o o 

O CO 

^5 

TO Z 

-J ex. 



o 

TO 



1 S 



o 

O) 

i: 

tvj OJ 

c in 
<L> ^ 

o n) 

V. w 

^5 

Q 
^ >^ 

CO 

c> _ 
to O) 

_L CL 
CD »/> 

E ° 



CO 
CM 



C35 
CM 



CM 
CO 
CO 



O 
CM 



TO 

£ 



TO 
O) 
O 

E 



< 
a. 



> 



o 
>» 
E 
o 



CO 

co 



< 

C7) 



< 



< 

CO 

CM 
CM 
CM 

•tf 
Q. 



< 



< 
o 



CM 



O 



CD 
CM 



O 



o 



00 



CN 



CD 

in 



CO 
CO 



to 

CO 

to 



CO 
CD 



CO 
CD 



CO 
CD 



CO 
CO 
CO 
CM 



CO 
CO 
CM 



o 1 ^ 

CO ! GO 
CN CM 






5 
10 


Function 


pyruvate oxidase 


multidrug efflux protein 


transcriptional regulator 


hypothetical membrane protein 




3-ketosteroid dehydrogenase 


transcriptional regulator, LysR family 


hypothetical protein 


hypothetical protein 




hypothetical protein 


hypothetical membrane protein 


transcription initiation factor sigma 


trehalose-6-phosphate synthase




trehalose-phosphatase 


glucose-resistance amylase 
regulator 


high-affinity zinc uptake system
protein 


15 


Matched 
length 
(aa) 


in 


0 

m 


CN 


CN 




CO 

0 

CO 


CN 
CO 
CN 


00 

CN 


CO 
CO 
CN 




0 


CO 


to 
in 


CO 




to 

CN 


CO 


CO 

in 

CO 




Similarity 
(%) 






































20 


75.8 


68.9 


68.5 


78.4 




62.1 


0 

ai 

(O 


52.9 


55.6 




50.7 


64.0 


50.3 


66.7 




57.6 


60.2 


46.7 




Identity 
(%) 


46.3 


33.3 


0 

CO 


45.6 




34.3 


37.1 


28.4 


26.7 




CO 
CO 
CN 


36.0 


32.3 


38.8 




27.4 


24.7 
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CN 


25 

03 
3 
C 

o 
o 

30 ^ 
1— 

to 

35 


Homologous gene 


Escherichia coli K12 poxB


Staphylococcus aureus plasmid 
pSk23 qacB 


Escherichia coli K12 ycdC


Mycobacterium tuberculosis 
H37Rv Rv2508c




Rhodococcus erythropolis SQ1 
kstD1


Bacillus subtilis 168 alsR 


Mycobacterium tuberculosis 
H37Rv Rv3298c lpqC


Bacillus subtilis 168 ykrA 




Oryctolagus cuniculus kidney
cortex rBAT 


Mycobacterium tuberculosis 
H37Rv Rv3737 


Streptomyces griseus hrdB 


Schizosaccharomyces pombe 
tps1




Escherichia coli K12 otsB 


Bacillus megaterium ccpA 


Haemophilus influenzae Rd 
HI0119 znuA


40 
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0' 
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0 

CL 
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2780446 
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2788594 
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2789477 
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2792448 


2792857 


2794327 
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2795676 
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13 



03 



Function 


dihydrodipicolinate synthase 


glucokinase 


N-acetylmannosamine-6-phosphate 
epimerase 




sialidase precursor 


L-asparagine permease operon 
repressor 


dipeptide transporter protein or 
heme-binding protein 


dipeptide transport system 
permease protein 


oligopeptide transport ATP-binding 
protein 


oligopeptide transport ATP-binding 
protein 


homoserine/homoserine lactone
efflux protein or lysE type 
translocator 


leucine-responsive regulatory 
protein 




hypothetical protein 


hypothetical protein 


transcription factor 


Matched 
length 
(aa) 


Similarity
(%)


CN 
CO 


57.6 


68.6 




50.3 


57.2 


51.4 


64.3 


78.3 


78.7 




CM 
CO 


66.2 




86.2 


71.5 


91.1 


Identity 
(%) 


28.2 


CD 
CN 


36.4 




24.8 


26.6 


22.5 


31.9 


46.5 


43.4 


28.5 


31.0 




55.9 


46.4


73.3 


Homologous gene 


Escherichia coli K12 dapA


Streptomyces coelicolor A3(2) 
SC6E10.20C glk


Clostridium perfringens NCTC 
8798 nanE 




Micromonospora viridifaciens 
ATCC 31146 nadA 


Rhizobium etli ansR 


Bacillus firmus OF4 dppA


Bacillus firmus OF4 dppB


Bacillus subtilis 168 oppD


Lactococcus lactis oppF 


Escherichia coli K12 rhtB


Bradyrhizobium japonicum lrp




Mycobacterium tuberculosis 
H37Rv Rv3581c


Mycobacterium tuberculosis
H37Rv Rv3582c


Mycobacterium tuberculosis
H37Rv Rv3583c


(nt)


2816393 


2817317 


2818058 


2818137 


2818350 


2819557 


2822191 


2823337 


2825341 


2826156 


2826215 


2827404 


2827458 


2827904 


2828379 


2829156 


Initial 
(nt) 


2815458 


2816409 


2817363 


2818313 


2819564 


2820285 


2820584 


2822387 


2824274 


2825341 


2826835 


2826922 


2827817 


2828383 


2829146 


2829749


Function


two-component system response 
regulator 


two-component system sensor 
histidine kinase 




DNA repair protein RadA 


hypothetical protein 


hypothetical protein 


p-hydroxybenzaldehyde
dehydrogenase 




mitochondrial carbonate 
dehydratase beta 


A/G-specific adenine glycosylase 






L-2,3-butanediol dehydrogenase








hypothetical protein 


virulence factor 


virulence factor 


Matched


virulence factor


ClpC adenosine triphosphatase /
ATP-binding proteinase 


inosine monophosphate 
dehydrogenase 


transcription factor 


phenol 2-monooxygenase










lincomycin resistance protein 


hypothetical protein 


lysyl-tRNA synthetase 


pantoate-beta-alanine ligase


GTP cyclohydrolase I




cell division protein FtsH 


hypoxanthine 
phosphoribosyltransferase 


cell cycle protein MesJ or cytosine 
deaminase-related protein 


D-alanyl-D-alanine 
carboxypeptidase


Function


peptide synthase 




phenylacetaldehyde dehydrogenase 


hypothetical protein 


hypothetical protein 


hypothetical protein 


heat shock protein or chaperon or 
groEL protein 














hypothetical protein 






peptidase 






Na+/H+ antiporter or multiple 
resistance and pH regulation related 
protein A or NADH dehydrogenase


Function




membrane transport protein or 
bicyclomycin resistance protein


sodium dependent phosphate pump 


phenazine biosynthesis protein 




ABC transporter 


ABC transporter ATP-binding protein 


mutator mutT protein 


hypothetical membrane protein 


glutamine-binding protein precursor 


serine/threonine kinase




ferredoxin/ferredoxin-NADP 
reductase 


acetyltransferase (GNAT) family


Function


insertion element (IS3 related) 


insertion element (IS3 related) 


two-component system sensor 
histidine kinase


transcriptional regulator 




adenylosuccinate synthetase 


hypothetical protein 




hypothetical membrane protein 


fructose-bisphosphate aldolase 


hypothetical protein 


methyltransferase 


orotate phosphoribosyltransferase 


hypothetical protein 


3-mercaptopyruvate 
sulfurtransferase


Table 1 (continued)


Homologous gene 


Corynebacterium glutamicum 


Corynebacterium glutamicum 
orti 


Streptomyces thermoviolaceus 
opc-520 chiS 


Bacillus brevis ALK36 degU 




Corynebacterium
ammoniagenes purA


Mycobacterium tuberculosis
H37Rv Rv0358




Corynebacterium glutamicum 
AS019 ATCC 13059 ORF3


Corynebacterium glutamicum


Mycobacterium leprae
MLCB1883.13C 






Mycobacterium leprae 
MLCB1883.05C 


Streptomyces sp. acyA


Mycobacterium tuberculosis
H37Rv Rv0224c


Function


hypothetical membrane protein 


hypothetical membrane protein 


propionyl-CoA carboxylase complex 
B subunit 


polyketide synthase 


acyl-CoA synthase 


hypothetical protein 




major secreted protein PS1 protein 
precursor 






antigen 85-C 


hypothetical membrane protein 


nodulation protein 


hypothetical protein 


hypothetical protein 




phosphatidic acid phosphatase


Table 1 (continued)


Homologous gene 


Mycobacterium tuberculosis 
H37Rv Rv0204c


Mycobacterium tuberculosis
H37Rv Rv0401


Streptomyces coelicolor A3(2)
pccB 


Streptomyces erythraeus eryA 


Mycobacterium bovis BCG 


Mycobacterium tuberculosis 
H37Rv Rv3802c




Corynebacterium glutamicum 
(Brevibacterium flavum) ATCC 
17965 copi 






Mycobacterium tuberculosis 
Erdman Rv0129c fbpC


Mycobacterium tuberculosis


Azorhizobium caulinodans
ORS571 noeC 


Mycobacterium tuberculosis 
H37Rv Rv3807c


Mycobacterium tuberculosis


Function


transcriptional regulator 








hypothetical protein 


glucan 1,4-alpha-glucosidase 




glycerophosphoryl diester 
phosphodiesterase 


gluconate permease 






pyruvate kinase 


L-lactate dehydrogenase 


hypothetical protein 


hydrolase or haloacid 
dehalogenase-like hydrolase 


efflux protein 


transcription activator or 
transcriptional regulator GntR family 


phosphoesterase 


shikimate transport protein


Streptomyces coelicolor A3(2)
SC6G4.33 








Streptomyces lavendulae 
ORF372 


Saccharomyces cerevisiae 
S288C YIR019C sta1




Bacillus subtilis glpQ


Bacillus subtilis gntP 






Corynebacterium glutamicum
AS019 pyk


Brevibacterium flavum lctA


Mycobacterium tuberculosis
H37Rv Rv1069c


Streptomyces coelicolor A3(2) 
SC1C2.30


Brevibacterium linens ORF1
tmpA


Escherichia coli K12 MG1655
glcC


Escherichia coli K12 shiA


Function


L-lactate dehydrogenase or FMN- 
dependent dehydrogenase 




immunity repressor protein 






phosphatase or reverse 
transcriptase (RNA-dependent) 




peptidase or IAA-amino acid
hydrolase 




peptide methionine sulfoxide 
reductase 


superoxide dismutase (Fe/Mn) 


transcriptional regulator 


multidrug resistance transporter 








hypothetical protein 


membrane transport protein 


transcriptional regulator 


two-component system response 
regulator


Homologous gene


Neisseria meningitidis lldA 




Bacillus phage phi-105 ORF1






Caenorhabditis elegans 
Y51B11A.1 




Arabidopsis thaliana ill1




Escherichia coli B msrA 


Corynebacterium 
pseudodiphtheriticum sod 


Bacillus subtilis gltC


Corynebacterium glutamicum


Mycobacterium tuberculosis


Function






two-component system sensor
histidine kinase 


hypothetical protein 


hypothetical protein 


stage III sporulation protein 


transcriptional repressor 


transglycosylase-associated protein


hypothetical protein 


hypothetical protein 


RNA pseudouridylate synthase 


hypothetical protein 


hypothetical protein 




bacterial regulatory protein, gntR 
family or glc operon transcriptional 
activator 


hypothetical protein 


hypothetical protein


Table 1 (continued)


Homologous gene 






Corynebacterium diphtheriae 
ChrS 


Streptomyces coelicolor A3(2) 
SCH69.22C 


Streptomyces coelicolor A3(2) 
SCH69.20C 


Bacillus subtilis spoIIIJ


Mycobacterium tuberculosis
H37Rv Rv3173c


Escherichia coli K12 MG1655 
tagi 


Mycobacterium tuberculosis 
H37Rv Rv2005c


Escherichia coli K12 MG1655 
yhbW 


Chlorobium vibrioforme ybc5 


Chlamydia pneumoniae 


Chlamydia muridarum Nigg 
TC0129 




Escherichia coli K12 MG1655
glcC 


Streptomyces coelicolor 
SC4G6.31C 


Mycobacterium tuberculosis 
H37Rv Rv2744c


Function












methyltransferase 


nodulin 21-related protein 








transposon Tn501 resolvase




ferredoxin precursor 


hypothetical protein 


transposase 


1 

transposase protein fragment 
TnpNC 




glyceraldehyde-3-phosphate
dehydrogenase (pseudogene) 


lipoprotein 


copper/potassium-transporting 
ATPase B or cation transporting 
ATPase (E1-E2 family)


Table 1 (continued)


Homologous gene 












Streptomyces coelicolor A3(2) 
SCD35.11C 


soybean N021 








Pseudomonas aeruginosa TNP5 




Saccharopolyspora erythraea fer


thioredoxin ch2, M-type


N-acetylmuramoyl-L-alanine


hypothetical protein


hypothetical protein 


partitioning or sporulation protein 


glucose inhibited division protein B


Table 1 (continued)


Homologous gene


Function


elongation factor Tu 


preprotein translocase secY subunit


isocitrate dehydrogenase
(oxalosuccinate decarboxylase)


acyl-CoA carboxylase or biotin- 
binding protein 


citrate synthase 


putative binding protein or peptidyl- 
prolyl cis-trans isomerase 


glycine betaine transporter


L-lysine permease


succinyl diaminopimelate
desuccinylase 


proline transport system


Table 1 (continued)


Homologous gene


Function 


uridylyltransferase, uridylyl-
removing enzyme


nitrogen regulatory protein P-II 


ammonium transporter 


glutamate dehydrogenase (NADP+) 


pyruvate kinase 


glucokinase 


glutamine synthetase 


threonine synthase 


ectoine/proline/glycine betaine 
carrier 


malate synthase 


isocitrate lyase 


glutamate 5-kinase 


cystathionine gamma-synthase 


ribonucleotide reductase 


glutaredoxin


Function 


meso-diaminopimelate D- 
dehydrogenase 


porin or cell wall channel forming 
protein 


acetate kinase 


phosphate acetyltransferase


multidrug resistance protein or
drug:proton antiporter


ATP-dependent protease regulatory 
subunit 


prephenate dehydratase


Example 2

Determination of effective mutation site 

(1) Identification of mutation site based on the comparison of the gene nucleotide sequence of lysine-producing B-6
strain with that of wild type strain ATCC 13032

[0374] Corynebacterium glutamicum B-6, which is resistant to S-(2-aminoethyl)cysteine (AEC), rifampicin,
streptomycin and 6-azauracil, is a lysine-producing mutant that was bred by subjecting the wild type ATCC
13032 strain to multiple rounds of random mutagenesis with the mutagen N-methyl-N'-nitro-N-nitrosoguanidine (NTG)
and screening (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)). First, the nucleotide sequences of genes derived
from the B-6 strain and considered to relate to lysine production were determined by a method similar to the above.
The genes relating to lysine production include lysE and lysG, which are lysine-excreting genes; ddh, dapA, hom
and lysC (encoding diaminopimelate dehydrogenase, dihydrodipicolinate synthase, homoserine dehydrogenase and
aspartokinase, respectively), which are lysine-biosynthetic genes; and pyc and zwf (encoding pyruvate carboxylase
and glucose-6-phosphate dehydrogenase, respectively), which are glucose-metabolizing genes. The nucleotide
sequences of the genes derived from the production strain were compared with the corresponding nucleotide sequences
of the ATCC 13032 strain genome represented by SEQ ID NOS:1 to 3501 and analyzed. As a result, mutation points
were observed in many, but not all, of the genes. Specifically, no mutation site was observed in lysE, lysG, ddh,
dapA, and the like, whereas amino acid replacement mutations were found in hom, lysC, pyc, zwf, and the like.
Among these mutation points, those which are considered to contribute to the production were extracted on the
basis of known biochemical or genetic information. Among the mutation points thus extracted, a mutation, Val59Ala,
in hom and a mutation, Pro458Ser, in pyc were evaluated as to whether or not they were effective according to the
following method.
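The comparison step described above can be illustrated with a minimal sketch. It assumes two in-frame coding sequences of equal length (no insertions or deletions) and the standard genetic code, and reports amino acid replacements in the same notation as Val59Ala. The function name and the toy sequences are hypothetical; they are not the hom or pyc sequences of the specification.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def amino_acid_replacements(wild_cds, mutant_cds):
    """List replacements such as 'Val59Ala' between two aligned coding sequences."""
    if len(wild_cds) != len(mutant_cds) or len(wild_cds) % 3 != 0:
        raise ValueError("sequences must be in frame and of equal length")
    changes = []
    for i in range(0, len(wild_cds), 3):
        wild_aa = CODON_TABLE[wild_cds[i:i + 3].upper()]
        mutant_aa = CODON_TABLE[mutant_cds[i:i + 3].upper()]
        if wild_aa != mutant_aa:  # silent (synonymous) changes are skipped
            position = i // 3 + 1  # 1-based residue number
            changes.append(f"{THREE_LETTER[wild_aa]}{position}{THREE_LETTER[mutant_aa]}")
    return changes


# Toy example (hypothetical sequences): Met-Val-Pro versus Met-Ala-Ser.
print(amino_acid_replacements("ATGGTTCCG", "ATGGCTTCG"))
```

A nucleotide change that leaves the encoded residue unchanged is not reported, which mirrors the distinction drawn above between mutation points in general and amino acid replacement mutations.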

(2) Evaluation of mutation, Val59Ala, in hom and mutation, Pro458Ser, in pyc

[0375] It is known that a mutation in hom inducing requirement or partial requirement for homoserine imparts lysine
productivity to a wild type strain (Amino Acid Fermentation, ed. by Hiroshi Aida et al., Japan Scientific Societies Press).
However, the relationship between the mutation, Val59Ala, in hom and lysine production is not known. Whether or not
the mutation, Val59Ala, in hom is an effective mutation can be examined by introducing the mutation into the wild
type strain and examining the lysine productivity of the resulting strain. On the other hand, whether or not the
mutation, Pro458Ser, in pyc is effective can be examined by introducing this mutation into a lysine-producing strain
which has a deregulated lysine-biosynthetic pathway and is free from the pyc mutation, and comparing the lysine
productivity of the resulting strain with that of the parent strain. As such a lysine-producing bacterium, the No. 58
strain (FERM BP-7134) was selected (hereinafter referred to as the "lysine-producing No. 58 strain" or the "No. 58
strain"). Based on the above, it was decided to introduce the mutation, Val59Ala, in hom and the mutation, Pro458Ser,
in pyc into the wild type strain of Corynebacterium glutamicum ATCC 13032 (hereinafter referred to as the "wild type
ATCC 13032 strain" or the "ATCC 13032 strain") and the lysine-producing No. 58 strain, respectively, using the gene
replacement method. A plasmid vector pCES30 for the gene replacement was constructed by the following method.

[0376] A plasmid vector pCE53 having a kanamycin-resistant gene and being capable of autonomously replicating
in coryneform bacteria (Mol. Gen. Genet., 196: 175-178 (1984)) and a plasmid pMOB3 (ATCC 77282) containing a
levansucrase gene (sacB) of Bacillus subtilis (Molecular Microbiology, 6: 1195-1204 (1992)) were each digested with
PstI. Then, after agarose gel electrophoresis, a pCE53 fragment and a 2.6 kb DNA fragment containing sacB were
each extracted and purified using GENECLEAN Kit (manufactured by BIO 101). The pCE53 fragment and the 2.6 kb
DNA fragment were ligated using Ligation Kit ver. 2 (manufactured by Takara Shuzo), introduced into the ATCC 13032
strain by the electroporation method (FEMS Microbiology Letters, 65: 299 (1989)), and cultured on BYG agar medium
(medium prepared by adding 10 g of glucose, 20 g of peptone (manufactured by Kyokuto Pharmaceutical), 5 g of yeast
extract (manufactured by Difco), and 16 g of Bactoagar (manufactured by Difco) to 1 liter of water, and adjusting its
pH to 7.2) containing 25 µg/ml kanamycin at 30°C for 2 days to obtain a transformant acquiring kanamycin resistance.
As a result of digestion analysis with restriction enzymes, it was confirmed that a plasmid extracted from the resulting
transformant by the alkali SDS method had a structure in which the 2.6 kb DNA fragment had been inserted into the
PstI site of pCE53. This plasmid was named pCES30.

[0377] Next, two genes having a mutation point, hom and pyc, were amplified by PCR, and inserted into pCES30
according to the TA cloning method (Bio Experiment Illustrated vol. 3, published by Shujunsha). Specifically, pCES30
was digested with BamHI (manufactured by Takara Shuzo), subjected to agarose gel electrophoresis, and extracted
and purified using GENECLEAN Kit (manufactured by BIO 101). Both ends of the resulting pCES30 fragment were
blunted with DNA Blunting Kit (manufactured by Takara Shuzo) according to the attached protocol. The blunt-ended
pCES30 fragment was concentrated by extraction with phenol/chloroform and precipitation with ethanol, and allowed
to react in the presence of Taq polymerase (manufactured by Roche Diagnostics) and dTTP at 70°C for 2 hours so
that a nucleotide, thymine (T), was added to the 3'-end to prepare a T vector of pCES30.

[0378] Separately, chromosomal DNA was prepared from the lysine-producing B-6 strain according to the method
of Saito et al. (Biochim. Biophys. Acta, 72: 619 (1963)). Using the chromosomal DNA as a template, PCR was carried
out with Pfu turbo DNA polymerase (manufactured by Stratagene). For the mutated hom gene, the DNAs having the
nucleotide sequences represented by SEQ ID NOS:7002 and 7003 were used as the primer set. For the mutated pyc
gene, the DNAs having the nucleotide sequences represented by SEQ ID NOS:7004 and 7005 were used as the primer
set. The resulting PCR product was subjected to agarose gel electrophoresis, and extracted and purified using
GENECLEAN Kit (manufactured by BIO 101). Then, the PCR product was allowed to react in the presence of Taq polymerase
(manufactured by Roche Diagnostics) and dATP at 72°C for 10 minutes so that a nucleotide, adenine (A), was added
to the 3'-end.

[0379] The above pCES30 T vector fragment and the PCR product of the mutated hom gene (1.7 kb) or mutated pyc
gene (3.6 kb) to which the nucleotide A had been added were concentrated by extraction with phenol/chloroform
and precipitation with ethanol, and then ligated using Ligation Kit ver. 2. The ligation products were introduced into the
ATCC 13032 strain according to the electroporation method, and cultured on BYG agar medium containing 25 µg/ml
kanamycin at 30°C for 2 days to obtain kanamycin-resistant transformants. Each of the resulting transformants was
cultured overnight in BYG liquid medium containing 25 µg/ml kanamycin, and a plasmid was extracted from the culture
according to the alkali SDS method. As a result of digestion analysis using restriction enzymes, it was
confirmed that the plasmid had a structure in which the 1.7 kb or 3.6 kb DNA fragment had been inserted into pCES30.
The plasmids thus constructed were named pChom59 and pCpyc458, respectively.

[0380] The mutations were introduced into the wild type ATCC 13032 strain and the lysine-producing No. 58 strain
by the gene replacement method as follows. Specifically, pChom59 and pCpyc458 were introduced into the ATCC 13032
strain and the No. 58 strain, respectively, and strains in which the plasmid was integrated into the chromosomal DNA
by homologous recombination were selected using the method of Ikeda et al. (Microbiology, 144: 1863 (1998)). Then,
the strains in which the second homologous recombination had occurred were selected by a selection method making
use of the fact that the Bacillus subtilis levansucrase encoded by pCES30 produces a suicidal substance (J. Bacteriol.,
174: 5462 (1992)). Among the selected strains, strains in which the wild type hom and pyc genes possessed by the
ATCC 13032 strain and the No. 58 strain were replaced with the mutated hom and pyc genes, respectively, were
isolated. The method is specifically explained below.

[0381] One strain was selected from the transformants containing the plasmid, pChom59 or pCpyc458, the
selected strain was cultured in BYG medium containing 20 µg/ml kanamycin, and pCG11 (Japanese Published Examined
Patent Application No. 91827/94) was introduced thereinto by the electroporation method. pCG11 is a plasmid
vector having a spectinomycin-resistant gene and a replication origin which is the same as that of pCE53. After
introduction of pCG11, the strain was cultured on BYG agar medium containing 20 µg/ml kanamycin and 100 µg/ml
spectinomycin at 30°C for 2 days to obtain a transformant resistant to both kanamycin and spectinomycin. The
chromosome of one strain of these transformants was examined by Southern blotting hybridization according to the
method reported by Ikeda et al. (Microbiology, 144: 1863 (1998)). As a result, it was confirmed that pChom59 or
pCpyc458 had been integrated into the chromosome by homologous recombination of the Campbell type. In such a
strain, the wild type and mutated hom or pyc genes are present closely on the chromosome, and the second
homologous recombination is liable to arise therebetween.

[0382] Each of these transformants (having been recombined once) was spread on Suc agar medium (medium
prepared by adding 100 g of sucrose, 7 g of meat extract, 10 g of peptone, 3 g of sodium chloride, 5 g of yeast extract
(manufactured by Difco), and 18 g of Bactoagar (manufactured by Difco) to 1 liter of water, and adjusting its pH to 7.2)
and cultured at 30°C for a day. Then the colonies thus growing were selected in each case. Since a strain in which the
sacB gene is present converts sucrose into a suicide substrate, it cannot grow in this medium (J. Bacteriol., 174: 5462
(1992)). On the other hand, a strain in which the sacB gene was deleted due to the second homologous recombination
between the wild type and the mutated hom or pyc genes positioned closely to each other forms no suicide substrate
and, therefore, can grow in this medium. In the homologous recombination, either the wild type gene or the mutated
gene is deleted together with the sacB gene. When the wild type gene is deleted together with the sacB gene, the gene
replacement into the mutated type arises.
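The selection logic of the two recombination steps can be summarized in a small sketch. This is only a bookkeeping model under simplifying assumptions (one target gene, one integrated plasmid copy, sucrose sensitivity determined solely by the presence of sacB); the class and function names are hypothetical and do not come from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chromosome:
    alleles: tuple   # copies of the target gene currently on the chromosome
    has_sacB: bool   # True while the integrated vector (with sacB) is present


def integrate(plasmid_allele):
    """First (Campbell-type) recombination: the whole plasmid enters the chromosome,
    so the wild type and plasmid-borne alleles sit side by side with sacB."""
    return Chromosome(alleles=("wild", plasmid_allele), has_sacB=True)


def resolve(chrom, deleted_allele):
    """Second recombination: one allele is lost together with the vector and sacB."""
    remaining = list(chrom.alleles)
    remaining.remove(deleted_allele)
    return Chromosome(alleles=tuple(remaining), has_sacB=False)


def grows_on_sucrose(chrom):
    """A strain still carrying sacB forms the suicide substrate and cannot grow."""
    return not chrom.has_sacB


first = integrate("mutant")
for deleted in ("wild", "mutant"):
    second = resolve(first, deleted)
    print(deleted, "deleted ->", second.alleles, "grows on sucrose:", grows_on_sucrose(second))
```

Both resolution outcomes survive on the sucrose medium, which is why the surviving colonies still have to be checked by sequencing to tell the replaced strain from a revertant to wild type.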

[0383] Chromosomal DNA of each of the thus obtained second recombinants was prepared by the above method of
Saito et al. PCR was carried out using Pfu turbo DNA polymerase (manufactured by Stratagene) and the attached
buffer. For the hom gene, DNAs having the nucleotide sequences represented by SEQ ID NOS:7002 and 7003 were
used as the primer set. For the pyc gene, DNAs having the nucleotide sequences represented by SEQ
ID NOS:7004 and 7005 were used as the primer set. The nucleotide sequences of the PCR products were determined
by the conventional method so that it was judged whether the hom or pyc gene of the second recombinant was a wild
type or a mutant. As a result, the second recombinants which were called HD-1 and No. 58pyc were the target strains
having the mutated hom gene and pyc gene, respectively.

(3) Lysine production test of HD-1 and No. 58pyc strains

[0384] The HD-1 strain (strain obtained by incorporating the mutation, Val59Ala, in the hom gene into the ATCC
13032 strain) and the No. 58pyc strain (strain obtained by incorporating the mutation, Pro458Ser, in the pyc gene into
the lysine-producing No. 58 strain) were subjected to a culture test in a 5 l jar fermenter using the ATCC 13032
strain and the lysine-producing No. 58 strain, respectively, as controls, and lysine production was examined.

[0385] After culturing on BYG agar medium at 30°C for 24 hours, each strain was inoculated into 250 ml of a seed
medium (medium prepared by adding 50 g of sucrose, 40 g of corn steep liquor, 8.3 g of ammonium sulfate, 1 g of
urea, 2 g of potassium dihydrogenphosphate, 0.83 g of magnesium sulfate heptahydrate, 10 mg of iron sulfate
heptahydrate, 1 mg of copper sulfate pentahydrate, 10 mg of zinc sulfate heptahydrate, 10 mg of β-alanine, 5 mg of
nicotinic acid, 1.5 mg of thiamin hydrochloride, and 0.5 mg of biotin to 1 liter of water, and adjusting its pH to 7.2,
then to which 30 g of calcium carbonate had been added) contained in a 2 l baffled Erlenmeyer flask and cultured
therein at 30°C for 12 to 16 hours. The whole seed culture was inoculated into 1,400 ml of a main culture
medium (medium prepared by adding 60 g of glucose, 20 g of corn steep liquor, 25 g of ammonium chloride, 2.5 g of
potassium dihydrogenphosphate, 0.75 g of magnesium sulfate heptahydrate, 50 mg of iron sulfate heptahydrate, 13
mg of manganese sulfate pentahydrate, 50 mg of calcium chloride, 6.3 mg of copper sulfate pentahydrate, 1.3 mg of
zinc sulfate heptahydrate, 5 mg of nickel chloride hexahydrate, 1.3 mg of cobalt chloride hexahydrate, 1.3 mg of
ammonium molybdate tetrahydrate, 14 mg of nicotinic acid, 23 mg of β-alanine, 7 mg of thiamin hydrochloride, and
0.42 mg of biotin to 1 liter of water) contained in a 5 l jar fermenter and cultured therein at 32°C, 1 vvm and 800 rpm
while controlling the pH to 7.0 with aqueous ammonia. When glucose in the medium had been consumed, a glucose
feeding solution (medium prepared by adding 400 g of glucose and 45 g of ammonium chloride to 1 liter of water) was
continuously added. The feeding solution was added at a controlled speed so as to maintain the dissolved
oxygen concentration within a range of 0.5 to 3 ppm. After culturing for 29 hours, the culture was terminated.
The cells were separated from the culture medium by centrifugation and then L-lysine hydrochloride in the supernatant
was quantified by high performance liquid chromatography (HPLC). The results are shown in Table 2 below.



Table 2

Strain        L-Lysine hydrochloride yield (g/l)
ATCC 13032    0
HD-1          8
No. 58        45
No. 58pyc     51



[0386] As is apparent from the results shown in Table 2, the lysine productivity was improved by introducing the
mutation, Val59Ala, in the hom gene or the mutation, Pro458Ser, in the pyc gene. Accordingly, it was found that the
mutations are both effective mutations relating to the production of lysine. Strain AHP-3, in which the mutation,
Val59Ala, in the hom gene and the mutation, Pro458Ser, in the pyc gene have been introduced into the wild type ATCC
13032 strain together with the mutation, Thr311Ile, in the lysC gene, was deposited on December 5, 2000, in
National Institute of Bioscience and Human Technology, Agency of Industrial Science and Technology (Higashi 1-1-3,
Tsukuba-shi, Ibaraki, Japan) as FERM BP-7382.
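The effect of each mutation in Table 2 can be expressed as a gain over its control strain. The sketch below uses only the yields of Table 2 and the 29 hour culture time stated in the test; the helper names are hypothetical, and the mean productivity is a rough figure that assumes the final yield is simply divided by the total culture time, ignoring the volume change caused by feeding.

```python
# L-lysine hydrochloride yields from Table 2 (g/l) and culture time (h).
TABLE_2 = {"ATCC 13032": 0, "HD-1": 8, "No. 58": 45, "No. 58pyc": 51}
CULTURE_HOURS = 29


def compare_yield(control_g_per_l, test_g_per_l):
    """Return (absolute gain in g/l, relative gain in %, or None if the control is 0)."""
    gain = test_g_per_l - control_g_per_l
    relative = None if control_g_per_l == 0 else 100.0 * gain / control_g_per_l
    return gain, relative


def mean_productivity(yield_g_per_l, hours):
    """Average volumetric productivity in g/l/h over the whole culture."""
    return yield_g_per_l / hours


for control, test in [("ATCC 13032", "HD-1"), ("No. 58", "No. 58pyc")]:
    gain, relative = compare_yield(TABLE_2[control], TABLE_2[test])
    rel_text = "n/a (control yield is 0)" if relative is None else f"{relative:.1f} %"
    rate = mean_productivity(TABLE_2[test], CULTURE_HOURS)
    print(f"{test} vs {control}: +{gain} g/l, relative gain {rel_text}, "
          f"mean productivity {rate:.2f} g/l/h")
```

For the wild type background the relative gain is undefined because the control produced no detectable lysine, so only the absolute gain is meaningful there.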



Example 3 

Reconstruction of lysine-producing strain based on genome information

[0387] The lysine-producing mutant B-6 strain (Appl. Microbiol. Biotechnol., 32: 269-273 (1989)), which has been
constructed by multiple rounds of random mutagenesis with NTG and screening from the wild type ATCC 13032 strain,
produces a remarkably large amount of lysine hydrochloride when cultured in a jar at 32°C using glucose as a carbon
source. However, since the fermentation period is long, the production rate is less than 2.1 g/l/h. Breeding was
therefore performed to reconstitute, in the wild type ATCC 13032 strain, only the effective mutations relating to the
production of lysine among the estimated at least 300 mutations introduced into the B-6 strain.

(1) Identification of mutation point and effective mutation by comparing the gene nucleotide sequence of the B-6 strain
with that of the ATCC 13032 strain

[0388] As described above, the nucleotide sequences of genes derived from the B-6 strain were compared with the
corresponding nucleotide sequences of the ATCC 13032 strain genome represented by SEQ ID NOS:1 to 3501 and
analyzed to identify many mutation points accumulated in the chromosome of the B-6 strain. Among these, a mutation,
Val59Ala, in hom, a mutation, Thr311Ile, in lysC, a mutation, Pro458Ser, in pyc and a mutation, Ala213Thr, in zwf
were specified as effective mutations relating to the production of lysine. Breeding to reconstitute the 4 mutations in
the wild type strain and thereby construct an industrially important lysine-producing strain was carried out according
to the method shown below.

(2) Construction of plasmid for gene replacement having mutated gene 

[0389] The plasmid for gene replacement, pChom59, having the mutated hom gene and the plasmid for gene
replacement, pCpyc458, having the mutated pyc gene were prepared in the above Example 2(2). Plasmids for gene
replacement having the mutated lysC and zwf were produced as described below.

[0390] The lysC and zwf genes having mutation points were amplified by PCR, and inserted into a plasmid for gene
replacement, pCES30, according to the TA cloning method described in Example 2(2) (Bio Experiment Illustrated, Vol. 3).

[0391] Separately, chromosomal DNA was prepared from the lysine-producing B-6 strain according to the above
method of Saito et al. Using the chromosomal DNA as a template, PCR was carried out with Pfu turbo DNA polymerase
(manufactured by Stratagene). For the mutated lysC gene, the DNAs having the nucleotide sequences represented by
SEQ ID NOS:7006 and 7007 were used as the primer set. For the mutated zwf gene, the DNAs having the nucleotide
sequences represented by SEQ ID NOS:7008 and 7009 were used as the primer set. The resulting PCR product was
subjected to agarose gel electrophoresis, and extracted and purified using GENECLEAN Kit (manufactured by BIO 101).
Then, the PCR product was allowed to react in the presence of Taq DNA polymerase (manufactured by Roche Diagnostics)
and dATP at 72°C for 10 minutes so that a nucleotide, adenine (A), was added to the 3'-end.

[0392] The above pCES30 T vector fragment and the PCR product of the mutated lysC gene (1.5 kb) or mutated zwf
gene (2.3 kb) to which the nucleotide A had been added were concentrated by extraction with phenol/chloroform
and precipitation with ethanol, and then ligated using Ligation Kit ver. 2. The ligation products were introduced into the
ATCC 13032 strain according to the electroporation method, and cultured on BYG agar medium containing 25 µg/ml
kanamycin at 30°C for 2 days to obtain kanamycin-resistant transformants. Each of the resulting transformants was
cultured overnight in BYG liquid medium containing 25 µg/ml kanamycin, and a plasmid was extracted from the culture
according to the alkali SDS method. As a result of digestion analysis using restriction enzymes, it was
confirmed that the plasmid had a structure in which the 1.5 kb or 2.3 kb DNA fragment had been inserted into pCES30.
The plasmids thus constructed were named pClysC311 and pCzwf213, respectively.

(3) Introduction of mutation, Thr311Ile, in lysC into one point mutant HD-1

[0393] Since the one point mutant HD-1, in which the mutation, Val59Ala, in hom was introduced into the
wild type ATCC 13032 strain, had been obtained in Example 2(2), the mutation, Thr311Ile, in lysC was introduced into
the HD-1 strain using pClysC311 produced in the above (2) according to the gene replacement method described in
Example 2(2). PCR was carried out using chromosomal DNA of the resulting strain and, as the primer set, DNAs having
the nucleotide sequences represented by SEQ ID NOS:7006 and 7007 in the same manner as in Example 2(2). The
nucleotide sequence of the PCR product was determined in the usual manner, which confirmed that the resulting
strain, named AHD-2, was a two point mutant having the mutated lysC gene in addition to the mutated
hom gene.

(4) Introduction of mutation, Pro458Ser, in pyc into two point mutant AHD-2

[0394] The mutation, Pro458Ser, in pyc was introduced into the AHD-2 strain using the pCpyc458 produced in
Example 2(2) by the gene replacement method described in Example 2(2). PCR was carried out using chromosomal
DNA of the resulting strain and, as the primer set, DNAs having the nucleotide sequences represented by SEQ ID
NOS:7004 and 7005 in the same manner as in Example 2(2). The nucleotide sequence of
the PCR product was determined in the usual manner, and it was confirmed that the strain, which was named AHP-3, was
a three point mutant having the mutated pyc gene in addition to the mutated hom gene and lysC gene.

(5) Introduction of mutation, Ala213Thr, in zwf into three point mutant AHP-3

[0395] The mutation, Ala213Thr, in zwf was introduced into the AHP-3 strain using the pCzwf213 produced in the
above (2) by the gene replacement method described in Example 2(2). PCR was carried out using chromosomal DNA
of the resulting strain and, as the primer set, DNAs having the nucleotide sequences represented by SEQ ID NOS:
7008 and 7009 in the same manner as in Example 2(2). The nucleotide sequence of the PCR
product was determined in the usual manner, and it was confirmed that the strain, which was named APZ-4, was a four point
mutant having the mutated zwf gene in addition to the mutated hom gene, lysC gene and pyc gene.

(6) Lysine production test on HD-1, AHD-2, AHP-3 and APZ-4 strains

[0396] The HD-1, AHD-2, AHP-3 and APZ-4 strains obtained above were subjected to a culture test in a 5 l jar
fermenter in accordance with the method of Example 2(3).
[0397] Table 3 shows the results.

Table 3

Strain    L-Lysine hydrochloride (g/l)    Productivity (g/l/h)
HD-1      8                               0.3
AHD-2     73                              2.5
AHP-3     80                              2.8
APZ-4     86                              3.0



[0398] Since the lysine-producing mutant B-6 strain, which has been bred by random mutation and selection,
shows a productivity of less than 2.1 g/l/h, the APZ-4 strain showing a high productivity of 3.0 g/l/h is useful in industry.
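
The comparison in Table 3 can be restated as a short calculation. The following is only an illustrative sketch: the values are copied from Table 3, the function names are hypothetical, and because the text gives only an upper bound for the B-6 strain ("less than 2.1 g/l/h"), the fold improvement computed here is a lower bound rather than a measured value.

```python
# Table 3 values: strain -> (L-lysine hydrochloride titer in g/l, productivity in g/l/h)
TABLE_3 = {
    "HD-1": (8, 0.3),
    "AHD-2": (73, 2.5),
    "AHP-3": (80, 2.8),
    "APZ-4": (86, 3.0),
}

# Upper bound quoted in the text for the B-6 strain ("less than 2.1 g/l/h").
B6_PRODUCTIVITY_BOUND = 2.1


def exceeds_b6(table, bound=B6_PRODUCTIVITY_BOUND):
    """Return the strains whose productivity reaches or exceeds the B-6 bound."""
    return [strain for strain, (_, prod) in table.items() if prod >= bound]


def min_fold_over_b6(productivity, bound=B6_PRODUCTIVITY_BOUND):
    """Lower bound on the fold improvement over B-6 (B-6 is only bounded above)."""
    return productivity / bound
```

Under these assumptions, every reconstructed strain except the one point mutant HD-1 reaches the B-6 bound.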


(7) Lysine fermentation by APZ-4 strain at high temperature 

[0399] The APZ-4 strain, which had been reconstructed by introducing 4 effective mutations into the wild type strain,
was subjected to the culturing test in a 5 l jar fermenter in the same manner as in Example 2(3), except that the culturing
temperature was changed to 40°C.
[0400] The results are shown in Table 4.



Table 4

Temperature (°C)    L-Lysine hydrochloride (g/l)    Productivity (g/l/h)
32                  86                              3.0
40                  95                              3.3



[0401] As is apparent from the results shown in Table 4, the lysine hydrochloride titer and productivity in culturing at
a high temperature of 40°C were comparable to those at 32°C. In the lysine-producing B-6 strain
constructed by repeating random mutation and selection, the growth and the lysine productivity are lowered
at temperatures exceeding 34°C so that lysine fermentation cannot be carried out, whereas lysine fermentation can
be carried out using the APZ-4 strain at a high temperature of 40°C, so that the load of cooling is greatly reduced, which
is industrially useful. The lysine fermentation at high temperatures can be achieved because the high temperature
adaptability inherently possessed by the wild type strain is retained in the APZ-4 strain.

[0402] As demonstrated in the reconstruction of the lysine-producing strain, the present invention provides a novel
breeding method effective for eliminating the problems in the conventional mutants and acquiring industrially
advantageous strains. This methodology, which reconstructs the production strain by reconstituting the effective mutations, is
an approach which is efficiently carried out using the nucleotide sequence information of the genome disclosed in the
present invention, and its effectiveness was found for the first time in the present invention.

Example 4 

Production of DNA microarray and use thereof 

[0403] A DNA microarray was produced based on the nucleotide sequence information of the ORFs deduced from
the full nucleotide sequence of Corynebacterium glutamicum ATCC 13032 using software, and genes whose
expression fluctuates depending on the carbon source during culturing were searched for.

(1) Production of DNA microarray 

[0404] Chromosomal DNA was prepared from Corynebacterium glutamicum ATCC 13032 by the method of Saito et
al. (Biochem. Biophys. Acta, 72: 619 (1963)). Based on 24 genes having the nucleotide sequences represented by
SEQ ID NOS:207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448, 3451, 3453, 3455, 1743, 3470, 2132, 3476,
3477, 3485, 3488, 3489, 3494, 3496, and 3497 from the ORFs shown in Table 1 deduced from the full genome
nucleotide sequence of Corynebacterium glutamicum ATCC 13032 using software and the nucleotide sequence of rabbit
globin gene (GenBank Accession No. V00882) used as an internal standard, oligo DNA primers for PCR amplification
represented by SEQ ID NOS:7010 to 7059 targeting the nucleotide sequences of the genes were synthesized in a
usual manner.

[0405] As the oligo DNA primers used for the PCR,

[0406] DNAs having the nucleotide sequences represented by SEQ ID NOS:7010 and 7011 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:207,
[0407] DNAs having the nucleotide sequences represented by SEQ ID NOS:7012 and 7013 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3433,
[0408] DNAs having the nucleotide sequences represented by SEQ ID NOS:7014 and 7015 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:281,
[0409] DNAs having the nucleotide sequences represented by SEQ ID NOS:7016 and 7017 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3435,
[0410] DNAs having the nucleotide sequences represented by SEQ ID NOS:7018 and 7019 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3439,
[0411] DNAs having the nucleotide sequences represented by SEQ ID NOS:7020 and 7021 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:765,
[0412] DNAs having the nucleotide sequences represented by SEQ ID NOS:7022 and 7023 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3445,
[0413] DNAs having the nucleotide sequences represented by SEQ ID NOS:7024 and 7025 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1226,
[0414] DNAs having the nucleotide sequences represented by SEQ ID NOS:7026 and 7027 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1229,
[0415] DNAs having the nucleotide sequences represented by SEQ ID NOS:7028 and 7029 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3448,
[0416] DNAs having the nucleotide sequences represented by SEQ ID NOS:7030 and 7031 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3451,
[0417] DNAs having the nucleotide sequences represented by SEQ ID NOS:7032 and 7033 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3453,
[0418] DNAs having the nucleotide sequences represented by SEQ ID NOS:7034 and 7035 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3455,
[0419] DNAs having the nucleotide sequences represented by SEQ ID NOS:7036 and 7037 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:1743,
[0420] DNAs having the nucleotide sequences represented by SEQ ID NOS:7038 and 7039 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3470,
[0421] DNAs having the nucleotide sequences represented by SEQ ID NOS:7040 and 7041 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:2132,
[0422] DNAs having the nucleotide sequences represented by SEQ ID NOS:7042 and 7043 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3476,
[0423] DNAs having the nucleotide sequences represented by SEQ ID NOS:7044 and 7045 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3477,
[0424] DNAs having the nucleotide sequences represented by SEQ ID NOS:7046 and 7047 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3485,
[0425] DNAs having the nucleotide sequences represented by SEQ ID NOS:7048 and 7049 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3488,
[0426] DNAs having the nucleotide sequences represented by SEQ ID NOS:7050 and 7051 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3489,
[0427] DNAs having the nucleotide sequences represented by SEQ ID NOS:7052 and 7053 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3494,
[0428] DNAs having the nucleotide sequences represented by SEQ ID NOS:7054 and 7055 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3496,
[0429] DNAs having the nucleotide sequences represented by SEQ ID NOS:7056 and 7057 were used for the amplification of the DNA having the nucleotide sequence represented by SEQ ID NO:3497,
[0430] DNAs having the nucleotide sequences represented by SEQ ID NOS:7058 and 7059 were used for the amplification of the DNA having the nucleotide sequence of the rabbit globin gene,
as the respective primer set.
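
The primer assignment in paragraphs [0406] to [0430] follows a regular pattern: consecutive pairs of primer SEQ ID NOS, starting at 7010, are assigned to the targets in the order in which paragraph [0404] lists them. A minimal sketch of that mapping is given below; the names `TARGETS`, `FIRST_PRIMER` and `primer_sets` are hypothetical and serve only as illustration.

```python
# Targets in the order given in paragraph [0404]; the last entry is the rabbit
# globin internal standard, which is identified by name rather than SEQ ID NO.
TARGETS = [
    207, 3433, 281, 3435, 3439, 765, 3445, 1226, 1229, 3448, 3451, 3453,
    3455, 1743, 3470, 2132, 3476, 3477, 3485, 3488, 3489, 3494, 3496, 3497,
    "rabbit globin",
]

FIRST_PRIMER = 7010  # SEQ ID NO of the first primer in the series


def primer_sets(targets, first_primer=FIRST_PRIMER):
    """Map each target to its pair of primer SEQ ID NOS (consecutive numbers)."""
    return {
        target: (first_primer + 2 * i, first_primer + 2 * i + 1)
        for i, target in enumerate(targets)
    }


PRIMERS = primer_sets(TARGETS)
```

With 25 targets and two primers each, the series ends at SEQ ID NO:7059, which agrees with the range stated in paragraph [0404].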

[0431] The PCR was carried out for 30 cycles, each cycle consisting of 15 seconds at 95°C and 3 minutes at 68°C,
using a thermal cycler (GeneAmp PCR system 9600, manufactured by Perkin Elmer), TaKaRa Ex-Taq (manufactured
by Takara Shuzo), 100 ng of the chromosomal DNA and the buffer attached to the TaKaRa Ex-Taq reagent. In the case
of the rabbit globin gene, a single-stranded cDNA which had been synthesized from rabbit globin mRNA (manufactured
by Life Technologies) according to the manufacturer's instructions using a reverse transcriptase RAV-2 (manufactured
by Takara Shuzo) was used as the template. The PCR product of each gene thus amplified was subjected to agarose gel
electrophoresis and extracted and purified using QIAquick Gel Extraction Kit (manufactured by QIAGEN). The purified
PCR product was concentrated by precipitating it with ethanol and adjusted to a concentration of 200 ng/µl. Each PCR
product was spotted on a slide glass plate (manufactured by Matsunami Glass) having MAS coating in 2 runs using GTMASS
SYSTEM (manufactured by Nippon Laser & Electronics Lab.) according to the manufacturer's instructions.

(2) Synthesis of fluorescence labeled cDNA 

[0432] The ATCC 13032 strain was spread on BY agar medium (medium prepared by adding 20 g of peptone
(manufactured by Kyokuto Pharmaceutical), 5 g of yeast extract (manufactured by Difco), and 16 g of Bactoagar
(manufactured by Difco) to 1 liter of water and adjusting its pH to 7.2) and cultured at 30°C for 2 days. Then, the cultured
strain was further inoculated into 5 ml of BY liquid medium and cultured at 30°C overnight. Then, the cultured strain
was further inoculated into 30 ml of a minimum medium (medium prepared by adding 5 g of ammonium sulfate, 5 g of
urea, 0.5 g of monopotassium dihydrogenphosphate, 0.5 g of dipotassium monohydrogenphosphate, 20.9 g of
morpholinopropanesulfonic acid, 0.25 g of magnesium sulfate heptahydrate, 10 mg of calcium chloride dihydrate, 10 mg
of manganese sulfate monohydrate, 10 mg of ferrous sulfate heptahydrate, 1 mg of zinc sulfate heptahydrate, 0.2 mg
of copper sulfate, and 0.2 mg of biotin to 1 liter of water, and adjusting its pH to 6.5) containing 110 mmol/l glucose or 200
mmol/l ammonium acetate, and cultured in an Erlenmeyer flask at 30°C until the absorbance at 660 nm reached 1.0. After the
cells were collected by centrifuging at 4°C and 5,000 rpm for 10 minutes, total RNA was prepared from the resulting
cells according to the method of Bormann et al. (Molecular Microbiology, 6: 317-326 (1992)). To avoid contamination
with DNA, the RNA was treated with DNase I (manufactured by Takara Shuzo) at 37°C for 30 minutes and then further
purified using Qiagen RNeasy MiniKit (manufactured by QIAGEN) according to the manufacturer's instructions. To 30
µg of the resulting total RNA, 0.6 µl of rabbit globin mRNA (50 ng/µl, manufactured by Life Technologies) and 1 µl of
a random 6 mer primer (500 ng/µl, manufactured by Takara Shuzo) were added, followed by denaturing at 65°C for 10 minutes
and quenching on ice. To the resulting solution, 6 µl of a buffer attached to Superscript II (manufactured by
Life Technologies), 3 µl of 0.1 mol/l DTT, 1.5 µl of dNTPs (25 mmol/l dATP, 25 mmol/l dCTP, 25 mmol/l dGTP, 10 mmol/l
dTTP), 1.5 µl of Cy5-dUTP or Cy3-dUTP (manufactured by NEN) and 2 µl of Superscript II were added, and the mixture
was allowed to stand at 25°C for 10 minutes and then at 42°C for 110 minutes. The RNA extracted from the cells using glucose as
the carbon source and the RNA extracted from the cells using ammonium acetate were labeled with Cy5-dUTP and
Cy3-dUTP, respectively. After the fluorescence labeling reaction, the RNA was digested by adding 1.5 µl of 1 mol/l
sodium hydroxide-20 mmol/l EDTA solution and 3.0 µl of 10% SDS solution, and allowing the mixture to stand at 65°C for 10
minutes. The two cDNA solutions after the labeling were mixed and purified using Qiagen PCR purification Kit
(manufactured by QIAGEN) according to the manufacturer's instructions to give a volume of 10 µl.

(3) Hybridization

[0433] UltraHyb (110 µl) (manufactured by Ambion) and the fluorescence-labeled cDNA solution (10 µl) were mixed
and subjected to hybridization and the subsequent washing of the slide glass using GeneTAC Hybridization Station
(manufactured by Genomic Solutions) according to the manufacturer's instructions. The hybridization was carried out at
50°C, and the washing was carried out at 25°C.

(4) Fluorescence analysis 

[0434] The fluorescence amount of each DNA array having the fluorescent cDNA hybridized therewith was measured
using ScanArray 4000 (manufactured by GSI Lumonics).

[0435] Table 5 shows the Cy3 and Cy5 signal intensities of the genes having been corrected on the basis of the data
of the rabbit globin used as the internal standard and the Cy3/Cy5 ratios.
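
The text does not state how the correction against the rabbit globin internal standard was computed. One common approach, shown here purely as an assumption and not as the method actually used, is to rescale one channel so that the internal standard gives the same signal in both channels before taking the ratio. The function name and arguments below are hypothetical.

```python
def normalize_to_standard(cy3, cy5, std_cy3, std_cy5):
    """Rescale the Cy5 signal so that the internal standard reads the same in
    both channels, then return (cy3, corrected_cy5, cy3 / corrected_cy5).

    This is an assumed normalization scheme, not one taken from the source.
    """
    if std_cy5 <= 0 or cy5 <= 0:
        raise ValueError("signal intensities must be positive")
    scale = std_cy3 / std_cy5          # channel balance factor from the standard
    corrected_cy5 = cy5 * scale
    return cy3, corrected_cy5, cy3 / corrected_cy5
```

Because the same amount of rabbit globin mRNA is added to both labeling reactions, a gene whose raw signals differ only by the channel balance factor comes out with a ratio of 1 under this scheme.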

Table 5

SEQ ID NO    Cy3 intensity    Cy5 intensity    Cy3/Cy5
207          5248             3240             1.62
3433         2239             2694             0.83
281          2370             2595             0.91
3435         2566             2515             1.02
3439         5597             6944             0.81
765          6134             4943             1.24
3445         1169             1284             0.91
1226         1301             1493             0.87
1229         1168             1131             1.03
3448         1187             1594             0.74
3451         2845             3859             0.74
3453         3498             1705             2.05
3455         1491             1144             1.30
1743         1972             1841             1.07
3470         4752             3764             1.26
2132         1173             1085             1.08
3476         1847             1420             1.30
3477         1284             1164             1.10
3485         4539             8014             0.57
3488         34289            1398             24.52
3489         43645            1497             29.16
3494         3199             2503             1.28
3496         3428             2364             1.45
3497         3848             3358             1.15



[0436] The ORF function data estimated by using software were searched for SEQ ID NOS:3488 and 3489 showing
remarkably strong Cy3 signals. As a result, it was found that SEQ ID NOS:3488 and 3489 are a malate synthase
gene and an isocitrate lyase gene, respectively. It is known that these genes are transcriptionally induced by acetic
acid in Corynebacterium glutamicum (Archives of Microbiology, 168: 262-269 (1997)).
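
The selection of SEQ ID NOS:3488 and 3489 can be illustrated with a small calculation on a few rows of Table 5. This is a sketch only: the text speaks of "remarkably strong" Cy3 signals without giving a numerical cutoff, so the 10-fold threshold and the name `induced_genes` are assumptions introduced here for illustration.

```python
# A few rows copied from Table 5: (SEQ ID NO, Cy3 intensity, Cy5 intensity).
# Cy3 labels the acetate-grown sample and Cy5 the glucose-grown sample.
TABLE_5_ROWS = [
    (207, 5248, 3240),
    (3453, 3498, 1705),
    (3485, 4539, 8014),
    (3488, 34289, 1398),
    (3489, 43645, 1497),
    (3497, 3848, 3358),
]


def induced_genes(rows, min_ratio=10.0):
    """Return SEQ ID NOS whose Cy3/Cy5 ratio is at least min_ratio.

    min_ratio is a hypothetical cutoff; the source gives no numerical threshold.
    """
    return [seq_id for seq_id, cy3, cy5 in rows if cy3 / cy5 >= min_ratio]
```

With the assumed 10-fold cutoff only SEQ ID NOS:3488 and 3489 are selected; lowering the cutoff to 2-fold would additionally select SEQ ID NO:3453.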

[0437] As described above, genes whose expression fluctuates could be discovered by synthesizing appropriate
oligo DNA primers based on the ORF nucleotide sequence information deduced from the full genomic nucleotide
sequence information of Corynebacterium glutamicum ATCC 13032 using software, amplifying the nucleotide
sequences of the genes using the genome DNA of Corynebacterium glutamicum as a template in the PCR reaction, and thus
producing and using a DNA microarray.

[0438] This Example shows that the expression amounts of the 24 genes can be analyzed using a DNA microarray.
Moreover, the present DNA microarray techniques make it possible to prepare DNA microarrays having thereon
several thousand gene probes at once. Accordingly, it is also possible to prepare DNA microarrays having thereon all
of the ORF gene probes deduced from the full genomic nucleotide sequence of Corynebacterium glutamicum ATCC
13032 determined by the present invention, and analyze the expression profile at the total gene level of
Corynebacterium glutamicum using these arrays.

Example 5 

Homology search using Corynebacterium glutamicum genome sequence

(1) Search of adenosine deaminase

[0439] The amino acid sequence (ADD_ECOLI) of Escherichia coli adenosine deaminase was obtained from Swiss-Prot
Database as the amino acid sequence of a protein whose function had been confirmed as adenosine deaminase
(EC3.5.4.4). By using the full length of this amino acid sequence as a query, a homology search was carried out on a
nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or a database of the amino
acid sequences in the ORF region deduced from the genome sequence using FASTA program (Proc. Natl. Acad. Sci. USA, 85:
2444-2448 (1988)). A case where the E-value was 1e-10 or less was judged as being significantly homologous. As a result,
no sequence significantly homologous with the Escherichia coli adenosine deaminase was found in the nucleotide
sequence database of the genome sequence of Corynebacterium glutamicum or the database of the amino acid
sequences in the ORF region deduced from the genome sequence. Based on these results, it is assumed that
Corynebacterium glutamicum contains no ORF having adenosine deaminase activity and thus has no activity of converting
adenosine into inosine.
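
The significance rule used in these searches (an E-value of 1e-10 or less) amounts to a simple filter over the list of hits returned by the search program. The sketch below assumes the hits have already been parsed into records; the `Hit` type, its field names and the ORF labels used with it are hypothetical and do not correspond to any FASTA output format.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str     # query entry name, e.g. a Swiss-Prot entry
    subject: str   # hypothetical label of the matched ORF or genome region
    evalue: float  # E-value reported for the alignment


E_VALUE_CUTOFF = 1e-10  # hits at or below this value are judged significant


def significant_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Keep only the hits whose E-value is at or below the cutoff."""
    return [hit for hit in hits if hit.evalue <= cutoff]


def has_homolog(hits, query, cutoff=E_VALUE_CUTOFF):
    """True if the query has at least one significant hit."""
    return any(hit.query == query for hit in significant_hits(hits, cutoff))
```

A query such as ADD_ECOLI for which `has_homolog` returns False corresponds to the situation described above, in which no significantly homologous sequence is found.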

(2) Search of glycine cleavage enzyme 

[0440] The sequences (GCSP_ECOLI, GCST_ECOLI and GCSH_ECOLI) of glycine decarboxylase, aminomethyl
transferase and an aminomethyl group carrier, each of which is a component of Escherichia coli glycine cleavage
enzyme, were obtained from Swiss-Prot Database as the amino acid sequences of proteins whose function had been
confirmed as glycine cleavage enzyme (EC2.1.2.10).

[0441] By using these full-length amino acid sequences as a query, a homology search was carried out on a nucleotide
sequence database of the genome sequence of Corynebacterium glutamicum or a database of the ORF amino acid
sequences deduced from the genome sequence using FASTA program. A case where the E-value was 1e-10 or less was
judged as being significantly homologous. As a result, no sequence significantly homologous with the glycine
decarboxylase, the aminomethyl transferase or the aminomethyl group carrier, each of which is a component of Escherichia
coli glycine cleavage enzyme, was found in the nucleotide sequence database of the genome sequence of
Corynebacterium glutamicum or the database of the ORF amino acid sequences estimated from the genome sequence. Based
on these results, it is assumed that Corynebacterium glutamicum contains no ORF having the activity of glycine
decarboxylase, aminomethyl transferase or the aminomethyl group carrier and thus has no activity of the glycine cleavage
enzyme.

(3) Search of IMP dehydrogenase 

[0442] The amino acid sequence (IMDH_ECOLI) of Escherichia coli IMP dehydrogenase, as the amino acid sequence
of a protein whose function had been confirmed as IMP dehydrogenase (EC1.1.1.205), was obtained from Swiss-Prot
Database. By using the full length of this amino acid sequence as a query, a homology search was carried out on
a nucleotide sequence database of the genome sequence of Corynebacterium glutamicum or a database of the ORF
amino acid sequences predicted from the genome sequence using FASTA program. A case where the E-value was 1e-10
or less was judged as being significantly homologous. As a result, the amino acid sequences encoded by two ORFs,
namely, an ORF positioned in the region of the nucleotide sequence No. 615336 to 616853 (or ORF having the
nucleotide sequence represented by SEQ ID NO:672) and another ORF positioned in the region of the nucleotide sequence
No. 616973 to 618094 (or ORF having the nucleotide sequence represented by SEQ ID NO:674), were significantly
homologous with Escherichia coli IMP dehydrogenase. By using the above-described predicted amino
acid sequences as a query in order to examine the similarity of the amino acid sequences encoded by the ORFs with
IMP dehydrogenases of other organisms in greater detail, a search was carried out on GenBank
(http://www.ncbi.nlm.nih.gov/) nr-aa database (amino acid sequence database constructed on the basis of GenBank CDS translation
products, PDB database, Swiss-Prot database, PIR database and PRF database by eliminating duplicated registrations) using
BLAST program. As a result, both of the two amino acid sequences showed significant homologies with IMP
dehydrogenases of other organisms and clearly higher homologies with IMP dehydrogenases than with amino acid sequences
of other proteins, and thus, it was assumed that the two ORFs would function as IMP dehydrogenase. Based on these
results, it was therefore assumed that Corynebacterium glutamicum has two ORFs having the IMP dehydrogenase
activity.
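
As a quick consistency check on the two ORF positions quoted above, the length of each region can be computed from its coordinates and tested for divisibility by three. The sketch assumes that the positions are inclusive, 1-based nucleotide numbers; the function names are hypothetical.

```python
# ORF positions quoted in paragraph [0442]: SEQ ID NO -> (first, last) nucleotide
IMP_DEHYDROGENASE_ORFS = {
    672: (615336, 616853),
    674: (616973, 618094),
}


def orf_length(first, last):
    """Length in nucleotides, assuming inclusive 1-based coordinates."""
    return abs(last - first) + 1


def codon_count(first, last):
    """Number of codons in the region; raises if the length is not a multiple of 3."""
    length = orf_length(first, last)
    if length % 3 != 0:
        raise ValueError(f"length {length} is not a multiple of 3")
    return length // 3
```

Under that assumption both regions are whole numbers of codons (1518 and 1122 nucleotides), and the two ORFs are separated by 119 nucleotides on the genome.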

Example 6

Proteome analysis of proteins derived from Corynebacterium glutamicum

(1) Preparations of proteins derived from Corynebacterium glutamicum ATCC 13032, FERM BP-7134 and FERM BP-158

[0443] Culturing tests of Corynebacterium glutamicum ATCC 13032 (wild type strain), Corynebacterium glutamicum
FERM BP-7134 (lysine-producing strain) and Corynebacterium glutamicum FERM BP-158 (lysine-highly producing
strain) were carried out in a 5 l jar fermenter according to the method in Example 2(3). The results are shown in Table 6.
Table 6

Strain          L-Lysine yield (g/l)
ATCC 13032      0
FERM BP-7134    46
FERM BP-158     60



[0444] After culturing, cells of each strain were recovered by centrifugation. These cells were washed with Tris-HCl
buffer (10 mmol/l Tris-HCl, pH 6.5, 1.6 mg/ml protease inhibitor (COMPLETE; manufactured by Boehringer Mannheim))
three times to give washed cells which could be stored under freezing at -80°C. The freeze-stored cells were thawed
before use, and used as washed cells.

[0445] The washed cells described above were suspended in a disruption buffer (10 mmol/l Tris-HCl, pH 7.4, 5 mmol/l
magnesium chloride, 50 mg/l RNase, 1.6 mg/ml protease inhibitor (COMPLETE; manufactured by Boehringer
Mannheim)), and disrupted with a disrupter (manufactured by Brown) under cooling. To the resulting disruption solution,
DNase was added to give a concentration of 50 mg/l, and the solution was allowed to stand on ice for 10 minutes. The solution was
centrifuged (5,000 x g, 15 minutes, 4°C) to remove the undisrupted cells as the precipitate, and the supernatant was
recovered.

[0446] To the supernatant, urea was added to give a concentration of 9 mol/l, and an equivalent amount of a lysis
buffer (9.5 mol/l urea, 2% NP-40, 2% Ampholine, 5% mercaptoethanol, 1.6 mg/ml protease inhibitor (COMPLETE;
manufactured by Boehringer Mannheim)) was added thereto, followed by thoroughly stirring at room temperature for
dissolving.

[0447] After being dissolved, the solution was centrifuged at 12,000 x g for 15 minutes, and the supernatant was
recovered.

[0448] To the supernatant, ammonium sulfate was added to the extent of 80% saturation, followed by thoroughly
stirring for dissolving.

[0449] After being dissolved, the solution was centrifuged (16,000 x g, 20 minutes, 4°C), and the precipitate was
recovered. This precipitate was dissolved in the lysis buffer again and used in the subsequent procedures as a protein
sample. The protein concentration of this sample was determined by the Bradford method for quantifying protein.



(2) Separation of protein by two dimensional electrophoresis 



[0450] The first dimensional electrophoresis was carried out as described below by the isoelectric electrophoresis
method.

[0451] A molded dry IPG strip gel (pH 4-7, 13 cm, Immobiline DryStrips; manufactured by Amersham Pharmacia
Biotech) was set in an electrophoretic apparatus (Multiphor II or IPGphor; manufactured by Amersham Pharmacia
Biotech) and a swelling solution (8 mol/l urea, 0.5% Triton X-100, 0.6% dithiothreitol, 0.5% Ampholine, pH 3-10) was
packed therein, and the gel was allowed to swell for 12 to 16 hours.

[0452] The protein sample prepared above was dissolved in a sample solution (9 mol/l urea, 2% CHAPS, 1%
dithiothreitol, 2% Ampholine, pH 3-10), and then portions of about 100 to 500 µg (in terms of protein) thereof were taken and
added to the swollen IPG strip gel.

[0453] The electrophoresis was carried out in the 4 steps defined below while controlling the temperature at 20°C:

step 1: 1 hour under a gradient mode of 0 to 500 V;
step 2: 1 hour under a gradient mode of 500 to 1,000 V;
step 3: 4 hours under a gradient mode of 1,000 to 8,000 V; and
step 4: 1 hour at a constant voltage of 8,000 V.
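
Isoelectric focusing programs are often summarized by their accumulated volt-hours. The sketch below computes that figure for the four steps above, under the assumption (not stated in the text) that "gradient mode" means a linear voltage ramp, so that each step contributes its duration multiplied by the mean of its start and end voltages. The names used are hypothetical.

```python
# (duration in hours, start voltage, end voltage) for steps 1 to 4
IEF_PROGRAM = [
    (1, 0, 500),
    (1, 500, 1000),
    (4, 1000, 8000),
    (1, 8000, 8000),  # constant-voltage step: start and end are equal
]


def volt_hours(program):
    """Accumulated volt-hours, assuming a linear ramp within each step."""
    return sum(hours * (v_start + v_end) / 2 for hours, v_start, v_end in program)


def total_hours(program):
    """Total run time of the program in hours."""
    return sum(hours for hours, _, _ in program)
```

Under the linear-ramp assumption the program runs for 7 hours and accumulates 27,000 Vh, most of which comes from step 3.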

[0454] After the isoelectric electrophoresis, the IPG strip gel was removed from the holder and soaked in an equilibration
buffer A (50 mmol/l Tris-HCl, pH 6.8, 30% glycerol, 1% SDS, 0.25% dithiothreitol) for 15 minutes and then in another
equilibration buffer B (50 mmol/l Tris-HCl, pH 6.8, 6 mol/l urea, 30% glycerol, 1% SDS, 0.45% iodoacetamide) for 15 minutes
to sufficiently equilibrate the gel.

[0455] After the equilibration, the IPG strip gel was lightly rinsed in an SDS electrophoresis buffer (1.4% glycine, 0.1%
SDS, 0.3% Tris-HCl, pH 8.5), and the second dimensional electrophoresis depending on molecular weight was carried
out as described below to separate the proteins.

[0456] Specifically, the above IPG strip gel was closely placed on a 14% polyacrylamide slab gel (14% polyacrylamide,
0.37% bisacrylamide, 37.5 mmol/l Tris-HCl, pH 8.8, 0.1% SDS, 0.1% TEMED, 0.1% ammonium persulfate) and
subjected to electrophoresis under a constant current of 30 mA at 20°C for 3 hours to separate the proteins.

(3) Detection of protein spot 

[0457] Coomassie staining was performed by the method of Gorg et al. (Electrophoresis, 9: 531-546 (1988)) for the
slab gel after the second dimensional electrophoresis. Specifically, the slab gel was stained under shaking at 25°C for
about 3 hours, the excessive coloration was removed with a decoloring solution, and the gel was thoroughly washed
with distilled water.

[0458] The results are shown in Fig. 2. The proteins derived from the ATCC 13032 strain (Fig. 2A), FERM BP-7134
strain (Fig. 2B) and FERM BP-158 strain (Fig. 2C) could be separated and detected as spots.

(4) In-gel digestion of detected protein spot 

[0459] The detected spots were each cut out from the gel and transferred into a siliconized tube, and 400 µl of 100
mmol/l ammonium bicarbonate : acetonitrile solution (1:1, v/v) was added thereto, followed by shaking overnight and
freeze-drying as such. To the dried gel, 10 µl of a lysylendopeptidase (LysC) solution (manufactured by WAKO, prepared
with 0.1% SDS-containing 50 mmol/l ammonium bicarbonate to give a concentration of 100 ng/µl) was added and the
gel was allowed to stand for swelling at 0°C for 45 minutes, and then allowed to stand at 37°C for 16 hours. After
removing the LysC solution, 20 µl of an extracting solution (a mixture of 60% acetonitrile and 5% formic acid) was
added, followed by ultrasonication at room temperature for 5 minutes to disrupt the gel. After the disruption, the extract
was recovered by centrifugation (12,000 rpm, 5 minutes, room temperature). This operation was repeated twice to
recover the whole extract. The recovered extract was concentrated by centrifugation in vacuo to halve the liquid volume.
To the concentrate, 20 µl of 0.1% trifluoroacetic acid was added, followed by thoroughly stirring, and the mixture was
subjected to desalting using ZipTip (manufactured by Millipore). The protein adsorbed on the carriers of ZipTip was
eluted with 5 µl of α-cyano-4-hydroxycinnamic acid for use as a sample solution for analysis.

(5) Mass spectrometry and amino acid sequence analysis of protein spot with matrix assisted laser desorption ionization 
time of flight mass spectrometer (MALDI-TOFMS) 

[0460] The sample solution for analysis was mixed in the equivalent amount with a solution of a peptide mixture for
mass calibration (300 nmol/l Angiotensin II, 300 nmol/l Neurotensin, 150 nmol/l ACTH clip 18-39, 2.3 µmol/l bovine
insulin B chain), and 1 µl of the obtained solution was spotted on a stainless probe and crystallized by spontaneously
drying.

[0461] As measurement instruments, REFLEX MALDI-TOF mass spectrometer (manufactured by Bruker) and an
N2 laser (337 nm) were used in combination.

[0462] The analysis by PMF (peptide-mass finger printing) was carried out using integration spectra data obtained
by measuring 30 times at an accelerated voltage of 19.0 kV and a detector voltage of 1.50 kV under reflector mode
conditions. Mass calibration was carried out by the internal standard method.

[0463] The PSD (post-source decay) analysis was carried out using integration spectra obtained by successively
altering the reflection voltage and the detector voltage at an accelerated voltage of 27.5 kV.

[0464] The masses and amino acid sequences of the peptide fragments derived from the protein spot after digestion
were thus determined.

(6) Identification of protein spot 

[0465] From the amino acid sequence information of the digested peptide fragments derived from the protein spot
obtained in the above (5), ORFs corresponding to the protein were searched on the genome sequence database of
Corynebacterium glutamicum ATCC 13032 as constructed in Example 1 to identify the protein.
[0466] The identification of the protein was carried out using MS-Fit program and MS-Tag program of intranet Protein
Prospector.
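
Matching measured peptides against ORFs presupposes a list of the peptides that each predicted protein would yield on digestion. Lysylendopeptidase cleaves on the carboxyl side of lysine residues, so such a list can be sketched as below. This is a simplified illustration only: it ignores missed cleavages and modifications, the function name is hypothetical, and it does not reproduce how the MS-Fit or MS-Tag programs work internally.

```python
def lysc_digest(sequence):
    """Split a protein sequence (one-letter code) after every lysine (K).

    Simplified model of a complete Lys-C digest: no missed cleavages,
    no modifications. Returns the peptides in N- to C-terminal order.
    """
    peptides = []
    current = ""
    for residue in sequence.upper():
        current += residue
        if residue == "K":
            peptides.append(current)
            current = ""
    if current:  # C-terminal peptide that does not end in lysine
        peptides.append(current)
    return peptides
```

For example, the made-up sequence "MKTAYIAKQRK" would give the peptides "MK", "TAYIAK" and "QRK".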

(a) Search and identification of gene encoding high-expression protein 

[0467] Among the proteins derived from Corynebacterium glutamicum ATCC 13032 showing high expression amounts in
CBB-staining shown in Fig. 2A, the proteins corresponding to Spots-1, 2, 3, 4 and 5 were identified by the above method.
[0468] As a result, it was found that Spot-1 corresponded to enolase, a protein having the amino acid
sequence of SEQ ID NO:4585; Spot-2 corresponded to phosphoglycerate kinase, a protein having the amino
acid sequence of SEQ ID NO:5254; Spot-3 corresponded to glyceraldehyde-3-phosphate dehydrogenase,
a protein having the amino acid sequence represented by SEQ ID NO:5255; Spot-4 corresponded to fructose
bisphosphate aldolase, a protein having the amino acid sequence represented by SEQ ID NO:6543; and
Spot-5 corresponded to triose phosphate isomerase, a protein having the amino acid sequence represented by
SEQ ID NO:5252.

[0469] The genes represented by SEQ ID NOS:1085, 1754, 1775, 3043 and 1752, which encode the known proteins
corresponding to Spots-1, 2, 3, 4 and 5, respectively, are important in the central metabolic
pathway for maintaining the life of the microorganism. In particular, it is suggested that the genes of Spots-2, 3 and 5
form an operon and that a high-expression promoter is encoded upstream thereof (J. of Bacteriol., 174: 6067-6086
(1992)).

[0470] Also, the protein corresponding to Spot-9 in Fig. 2 was identified in the same manner as described above,
and it was found that Spot-9 was an elongation factor Tu, a protein having the amino acid sequence represented
by SEQ ID NO:6937, and that the protein was encoded by DNA having the nucleotide sequence represented
by SEQ ID NO:3437.

[0471] Based on these results, the proteins having a high expression level were identified by proteome analysis using
the genome sequence database of Corynebacterium glutamicum constructed in Example 1. Thus, the nucleotide
sequences of the genes encoding the proteins and the nucleotide sequences upstream thereof could be searched
simultaneously. Accordingly, it is shown that nucleotide sequences having a function as a high-expression promoter can be
efficiently selected.
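
Retrieving the sequence upstream of an identified gene can be sketched as follows. This is only an illustration under stated assumptions: the genome is treated as a linear string (wrap-around of a circular chromosome is ignored), ORF coordinates are 0-based and half-open on the forward strand, and the window of 300 bases and the function name `upstream_region` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genome, start, end, strand, length=300):
    """Return up to `length` bases located 5' of an ORF.

    genome : forward-strand sequence, treated as linear (assumption)
    start, end : 0-based, half-open ORF coordinates on the forward strand
    strand : "+" or "-"
    """
    if strand == "+":
        # Upstream of a forward-strand ORF lies before its start.
        return genome[max(0, start - length):start]
    # Upstream of a reverse-strand ORF lies after its end on the forward
    # strand and must be read as the reverse complement.
    return reverse_complement(genome[end:end + length])
```

Candidate promoter regions obtained in this way would still have to be tested experimentally for expression activity.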

(b) Search and identification of modified protein

[0472] Among the proteins derived from Corynebacterium glutamicum FERM BP-7134 shown in Fig. 2B, Spots-6,
7 and 8 were identified by the above method. As a result, these three spots all corresponded to catalase, a
protein having the amino acid sequence represented by SEQ ID NO:3785.
[0473] Accordingly, Spots-6, 7 and 8, which were detected as spots differing in isoelectric mobility, were all products derived
from a catalase gene having the nucleotide sequence represented by SEQ ID NO:285. This shows that
the catalase derived from Corynebacterium glutamicum FERM BP-7134 was modified after translation.
[0474] Based on these results, it is confirmed that various modified proteins can be efficiently searched for by proteome
analysis using the genome sequence database of Corynebacterium glutamicum constructed in Example 1.

(c) Search and identification of expressed protein effective in lysine production

[0475] It was found that, in Fig. 2A (ATCC 13032: wild type strain), Fig. 2B (FERM BP-7134: lysine-producing
strain) and Fig. 2C (FERM BP-158: lysine-highly producing strain), the catalase corresponding to Spot-8 and the
elongation factor Tu corresponding to Spot-9 as identified above showed a higher expression level with an increase in
lysine productivity.

[0476] Based on these results, it was found that promising mutated proteins can be efficiently searched for and identified
in breeding aimed at strengthening the productivity of a target product by proteome analysis using the genome
sequence database of Corynebacterium glutamicum constructed in Example 1.
[0477] Moreover, useful mutation points of useful mutants can be easily specified by searching the nucleotide
sequences (nucleotide sequences of promoter, ORF, or the like) relating to the identified proteins using the above
database and using primers designed on the basis of the sequences. Once the mutation points are
specified, industrially useful mutants which have the useful mutations or other useful mutations derived therefrom can
be easily bred.

[0478] While the invention has been described in detail and with reference to specific embodiments thereof, it will
be apparent to one of skill in the art that various changes and modifications can be made therein without departing
from the spirit and scope thereof. All references cited herein are incorporated in their entirety.



Claims

1. A method for at least one of the following:

(A) identifying a mutation point of a gene derived from a mutant of a coryneform bacterium,

(B) measuring an expression amount of a gene derived from a coryneform bacterium,

(C) analyzing an expression profile of a gene derived from a coryneform bacterium,

(D) analyzing expression patterns of genes derived from a coryneform bacterium, or

(E) identifying a gene homologous to a gene derived from a coryneform bacterium,

said method comprising:

(a) producing a polynucleotide array by adhering to a solid support at least two polynucleotides selected
from the group consisting of first polynucleotides comprising the nucleotide sequence represented by any
one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize with the first polynucleotides under
stringent conditions, and third polynucleotides comprising a sequence of 10 to 200 continuous bases of
the first or second polynucleotides,

(b) incubating the polynucleotide array with at least one of a labeled polynucleotide derived from a coryneform
bacterium, a labeled polynucleotide derived from a mutant of the coryneform bacterium or a
labeled polynucleotide to be examined, under hybridization conditions,

(c) detecting any hybridization, and 

(d) analyzing the result of the hybridization. 

2. The method according to claim 1, wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

3. The method according to claim 2, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

4. The method according to claim 1, wherein the polynucleotide derived from a coryneform bacterium, the polynucleotide
derived from a mutant of the coryneform bacterium or the polynucleotide to be examined is a gene relating
to the biosynthesis of at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide,
an organic acid, and analogues thereof.

5. The method according to claim 1, wherein the polynucleotide to be examined is derived from Escherichia coli.

6. A polynucleotide array, comprising:

at least two polynucleotides selected from the group consisting of first polynucleotides comprising the nucleotide
sequence represented by any one of SEQ ID NOS:1 to 3501, second polynucleotides which hybridize
with the first polynucleotides under stringent conditions, and third polynucleotides comprising 10 to 200 continuous
bases of the first or second polynucleotides, and
a solid support adhered thereto.

7. A polynucleotide comprising the nucleotide sequence represented by SEQ ID NO:1 or a polynucleotide having a 
homology of at least 80% with the polynucleotide. 

8. A polynucleotide comprising any one of the nucleotide sequences represented by SEQ ID NOS:2 to 3431, or a
polynucleotide which hybridizes with the polynucleotide under stringent conditions.

9. A polynucleotide encoding a polypeptide having any one of the amino acid sequences represented by SEQ ID
NOS:3502 to 6931, or a polynucleotide which hybridizes therewith under stringent conditions.

10. A polynucleotide which is present in the 5' upstream or 3' downstream of a polynucleotide comprising the nucleotide
sequence of any one of SEQ ID NOS:2 to 3431 in a whole polynucleotide comprising the nucleotide sequence
represented by SEQ ID NO:1, and has an activity of regulating an expression of the polynucleotide.

11. A polynucleotide comprising 10 to 200 continuous bases in the nucleotide sequence of the polynucleotide of any
one of claims 7 to 10, or a polynucleotide comprising a nucleotide sequence complementary to the polynucleotide
comprising 10 to 200 continuous bases.
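
As an informal illustration of the length window and the complementarity relation recited above, the following Python sketch checks whether a candidate oligonucleotide is a run of 10 to 200 continuous bases of a parent sequence or of its complementary strand. It is a sketch for orientation only and not a reading of claim scope; the function name `is_fragment_or_complement` and the exact-match criterion are assumptions.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def is_fragment_or_complement(fragment, parent, min_len=10, max_len=200):
    """True if `fragment` has min_len..max_len bases and occurs contiguously
    in `parent` or in the strand complementary to `parent`."""
    if not (min_len <= len(fragment) <= max_len):
        return False
    frag, par = fragment.upper(), parent.upper()
    return frag in par or frag in reverse_complement(par)
```

Such a check could serve, for example, as a sanity test when designing primers or probes from a stored sequence.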

12. A recombinant DNA comprising the polynucleotide of any one of claims 8 to 11.

13. A transformant comprising the polynucleotide of any one of claims 8 to 11 or the recombinant DNA of claim 12. 

14. A method for producing a polypeptide, comprising:

culturing the transformant of claim 13 in a medium to produce and accumulate a polypeptide encoded by the
polynucleotide of claim 8 or 9 in the medium, and
recovering the polypeptide from the medium.

15. A method for producing at least one of an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and
analogues thereof, comprising:

culturing the transformant of claim 13 in a medium to produce and accumulate at least one of an amino acid,
a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof in the medium, and
recovering the at least one of the amino acid, the nucleic acid, the vitamin, the saccharide, the organic acid,
and analogues thereof from the medium.

16. A polypeptide encoded by a polynucleotide comprising the nucleotide sequence selected from SEQ ID NOS:2 to
3431.

17. A polypeptide comprising the amino acid sequence selected from SEQ ID NOS:3502 to 6931.

18. The polypeptide according to claim 16 or 17, wherein at least one amino acid is deleted, replaced, inserted or
added, said polypeptide having an activity which is substantially the same as that of the polypeptide without said
at least one amino acid deletion, replacement, insertion or addition.

19. A polypeptide comprising an amino acid sequence having a homology of at least 60% with the amino acid sequence
of the polypeptide of claim 16 or 17, and having an activity which is substantially the same as that of the polypeptide.

20. An antibody which recognizes the polypeptide of any one of claims 16 to 19.

21. A polypeptide array, comprising:

at least one polypeptide or partial fragment polypeptide selected from the polypeptides of claims 16 to 19 and
partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto.

22. A polypeptide array, comprising:

at least one antibody which recognizes a polypeptide or partial fragment polypeptide selected from the polypeptides
of claims 16 to 19 and partial fragment polypeptides of the polypeptides, and
a solid support adhered thereto.

23. A system based on a computer for identifying a target sequence or a target structure motif derived from a coryneform
bacterium, comprising the following:

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:1
to 3501, and target sequence or target structure motif information;

(ii) a data storage device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS:1
to 3501 with the target sequence or target structure motif information, recorded by the data storage device,
for screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information; and

(iv) an output device that shows a screening or analyzing result obtained by the comparator.

24. A method based on a computer for identifying a target sequence or a target structure motif derived from a coryneform
bacterium, comprising the following:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501, target sequence
information or target structure motif information into a user input device;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:1 to 3501 with
the target sequence or target structure motif information; and

(iv) screening and analyzing nucleotide sequence information which is coincident with or analogous to the
target sequence or target structure motif information.
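
The store, compare and screen steps recited above can be pictured with a small Python sketch. It is not the system of the application: treating "coincident" as an exact match and "analogous" as a match within a chosen number of mismatches is an assumption, as are the function names `hamming_hits` and `screen` and the in-memory dictionary used as the data store.

```python
def hamming_hits(sequence, target, max_mismatch=0):
    """Slide `target` along `sequence` and report (offset, mismatches) for
    every window that differs in at most `max_mismatch` positions."""
    hits = []
    for i in range(len(sequence) - len(target) + 1):
        window = sequence[i:i + len(target)]
        mismatches = sum(a != b for a, b in zip(window, target))
        if mismatches <= max_mismatch:
            hits.append((i, mismatches))
    return hits


def screen(database, target, max_mismatch=0):
    """Compare a target motif with every stored sequence.

    database : dict mapping a sequence identifier to its nucleotide sequence
    Returns a dict of identifiers to hit lists; sequences without hits are
    omitted.
    """
    results = {}
    for seq_id, seq in database.items():
        hits = hamming_hits(seq.upper(), target.upper(), max_mismatch)
        if hits:
            results[seq_id] = hits
    return results
```

With `max_mismatch=0` only coincident occurrences are reported; raising it admits analogous ones. A production system would more likely rely on an established alignment tool than on this exhaustive scan.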

25. A system based on a computer for identifying a target sequence or a target structure motif derived from a coryne- 
form bacterium, comprising the following: 

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001, and target sequence or target structure motif information; 

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS: 
3502 to 7001 with the target sequence or target structure motif information, recorded by the data storage 
device for screening and analyzing amino acid sequence information which is coincident with or analogous to 
the target sequence or target structure motif information; and 

(iv) an output device that shows a screening or analyzing result obtained by the comparator. 

26. A method based on a computer for identifying a target sequence or a target structure motif derived from a coryne- 
form bacterium, comprising the following: 

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001 , and target 
sequence information or target structure motif information into a user input device; 

(ii) at least temporarily storing said information; 

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001 
with the target sequence or target structure motif information; and 

(iv) screening and analyzing amino acid sequence information which is coincident with or analogous to the 
target sequence or target structure motif information. 

27. A system based on a computer for determining a function of a polypeptide encoded by a polynucleotide having a 
target nucleotide sequence derived from a coryneform bacterium, comprising the following: 

(i) a user input device that inputs at least one nucleotide sequence information selected from SEQ ID NOS:2 
to 3501, function information of a polypeptide encoded by the nucleotide sequence, and target nucleotide 
sequence information; 

(ii) a data storage device for at least temporarily storing the input information; 

(iii) a comparator that compares the at least one nucleotide sequence information selected from SEQ ID NOS: 
2 to 3501 with the target nucleotide sequence information for determining a function of a polypeptide encoded 
by a polynucleotide having the target nucleotide sequence which is coincident with or analogous to the poly- 
nucleotide having at least one nucleotide sequence selected from SEQ ID NOS:2 to 3501; and 

(iv) an output device that shows a function obtained by the comparator.

28. A method based on a computer for determining a function of a polypeptide encoded by
a polynucleotide having a target nucleotide sequence derived from a coryneform bacterium, comprising the
following:

(i) inputting at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501, function
information of a polypeptide encoded by the nucleotide sequence, and target nucleotide sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one nucleotide sequence information selected from SEQ ID NOS:2 to 3501 with
the target nucleotide sequence information; and

(iv) determining a function of a polypeptide encoded by a polynucleotide having the target nucleotide sequence
which is coincident with or analogous to the polynucleotide having at least one nucleotide sequence selected
from SEQ ID NOS:2 to 3501.

29. A system based on a computer for determining a function of a polypeptide having a target amino acid sequence
derived from a coryneform bacterium, comprising the following:

(i) a user input device that inputs at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001, function information based on the amino acid sequence, and target amino acid sequence
information;

(ii) a data storing device for at least temporarily storing the input information;

(iii) a comparator that compares the at least one amino acid sequence information selected from SEQ ID NOS:
3502 to 7001 with the target amino acid sequence information for determining a function of a polypeptide
having the target amino acid sequence which is coincident with or analogous to the polypeptide having at least
one amino acid sequence selected from SEQ ID NOS:3502 to 7001; and

(iv) an output device that shows a function obtained by the comparator.

30. A method based on a computer for determining a function of a polypeptide having a target amino acid sequence
derived from a coryneform bacterium, comprising the following:

(i) inputting at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001, function
information based on the amino acid sequence, and target amino acid sequence information;

(ii) at least temporarily storing said information;

(iii) comparing the at least one amino acid sequence information selected from SEQ ID NOS:3502 to 7001
with the target amino acid sequence information; and

(iv) determining a function of a polypeptide having the target amino acid sequence which is coincident with or
analogous to the polypeptide having at least one amino acid sequence selected from SEQ ID NOS:3502 to
7001.

31. The system according to any one of claims 23, 25, 27 and 29, wherein a coryneform bacterium is a microorganism
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

32. The method according to any one of claims 24, 26, 28 and 30, wherein a coryneform bacterium is a microorganism 
of the genus Corynebacterium, the genus Brevibacterium, or the genus Microbacterium. 

33. The system according to claim 31, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

34. The method according to claim 32, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

35. A recording medium or storage device which is readable by a computer in which at least one nucleotide sequence
information selected from SEQ ID NOS:1 to 3501 or function information based on the nucleotide sequence is
recorded, and is usable in the system of claim 23 or 27 or the method of claim 24 or 28.

36. A recording medium or storage device which is readable by a computer in which at least one amino acid sequence
information selected from SEQ ID NOS:3502 to 7001 or function information based on the amino acid sequence
is recorded, and is usable in the system of claim 25 or 29 or the method of claim 26 or 30.

37. The recording medium or storage device according to claim 35 or 36, which is a computer readable recording 
medium selected from the group consisting of a floppy disc, a hard disc, a magnetic tape, a random access memory 
(RAM), a read only memory (ROM), a magneto-optic disc (MO), CD-ROM, CD-R, CD-RW, DVD-ROM, DVD-RAM 
and DVD-RW. 



38. A polypeptide having a homoserine dehydrogenase activity, comprising an amino acid sequence in which the Val
residue at the 59th position in the amino acid sequence of homoserine dehydrogenase derived from a coryneform bacterium
is replaced with an amino acid residue other than a Val residue.

39. A polypeptide comprising an amino acid sequence in which the Val residue at the 59th position in the amino acid
sequence as represented by SEQ ID NO:6952 is replaced with an amino acid residue other than a Val residue.

40. The polypeptide according to claim 38 or 39, wherein the Val residue at the 59th position is replaced with an Ala
residue.

41. A polypeptide having pyruvate carboxylase activity, comprising an amino acid sequence in which the Pro residue
at the 458th position in the amino acid sequence of pyruvate carboxylase derived from a coryneform bacterium is
replaced with an amino acid residue other than a Pro residue.

42. A polypeptide comprising an amino acid sequence in which the Pro residue at the 458th position in the amino acid
sequence represented by SEQ ID NO:4265 is replaced with an amino acid residue other than a Pro residue.

43. The polypeptide according to claim 41 or 42, wherein the Pro residue at the 458th position is replaced with a Ser
residue.
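
A single-residue replacement of the kind recited above (Val at position 59 replaced with Ala, or Pro at position 458 replaced with Ser) can be expressed in a few lines of Python. The sketch below uses a toy sequence, not SEQ ID NO:6952 or SEQ ID NO:4265, and the helper name `substitute_residue` is hypothetical.

```python
def substitute_residue(sequence, position, expected, replacement):
    """Replace the residue at a 1-based `position` of an amino acid sequence.

    The residue currently at that position must equal `expected`; this guards
    against off-by-one errors when numbering starts at the initial Met.
    """
    index = position - 1
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != expected:
        raise ValueError(
            f"expected {expected} at position {position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Toy sequence for illustration only: Val is placed at position 59.
toy = "M" + "G" * 57 + "V" + "LK"
mutant = substitute_residue(toy, 59, "V", "A")
```

The same helper called with `(sequence, 458, "P", "S")` would describe the pyruvate carboxylase replacement, provided the real sequence is supplied.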


44. The polypeptide according to any one of claims 38 to 43, which is derived from Corynebacterium glutamicum. 

45. A DNA encoding the polypeptide of any one of claims 38 to 44.

46. A recombinant DNA comprising the DNA of claim 45.

47. A transformant comprising the recombinant DNA of claim 46. 

48. A transformant comprising in its chromosome the DNA of claim 45. 


49. The transformant according to claim 47 or 48, which is derived from a coryneform bacterium. 

50. The transformant according to claim 49, which is derived from Corynebacterium glutamicum. 

51. A method for producing L-lysine, comprising:

culturing the transformant of any one of claims 47 to 50 in a medium to produce and accumulate L-lysine in 
the medium, and 

recovering the L-lysine from the culture. 

52. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:1 to 3431, comprising the following:

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform
bacterium which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i);

(iii) introducing the mutation point into a coryneform bacterium which is free of the mutation point, or deleting
the mutation point from a coryneform bacterium having the mutation point; and

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii).
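
Steps (i) and (ii) above amount to comparing a production-strain sequence with the corresponding reference sequence and listing the differences. A minimal Python sketch follows, assuming the two sequences are already aligned, have equal length and differ only by base substitutions; insertions and deletions would require a real alignment. The function name `mutation_points` and the example sequences are hypothetical.

```python
def mutation_points(reference, production):
    """List base substitutions between two aligned, equal-length ORFs.

    Returns tuples (base_position, reference_base, production_base,
    codon_number), with 1-based numbering from the first base of the ORF.
    """
    if len(reference) != len(production):
        raise ValueError("sequences must be aligned to equal length")
    points = []
    for i, (ref_base, prod_base) in enumerate(zip(reference, production)):
        if ref_base != prod_base:
            points.append((i + 1, ref_base, prod_base, i // 3 + 1))
    return points


# Hypothetical example: GTT (Val) in the reference becomes GCT (Ala).
print(mutation_points("ATGGTTAAA", "ATGGCTAAA"))
```

Each reported point would then be introduced into, or removed from, a strain as in step (iii) and its effect on productivity examined as in step (iv).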

53. The method according to claim 52, wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or
a signal transmission pathway.

54. The method according to claim 52, wherein the mutation point is a mutation point relating to a useful mutation
which improves or stabilizes the productivity.

55. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:1 to 3431, comprising:

(i) comparing a nucleotide sequence of a genome or gene of a production strain derived from a coryneform
bacterium which has been subjected to mutation breeding so as to produce at least one compound selected from
an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof by a fermentation
method, with a corresponding nucleotide sequence in SEQ ID NOS:1 to 3431;

(ii) identifying a mutation point present in the production strain based on a result obtained by (i);

(iii) deleting a mutation point from a coryneform bacterium having the mutation point; and

(iv) examining productivity by the fermentation method of the compound selected in (i) of the coryneform
bacterium obtained in (iii).

56. The method according to claim 55, wherein the gene is a gene encoding an enzyme in a biosynthetic pathway or
a signal transmission pathway.

57. The method according to claim 55, wherein the mutation point is a mutation point which decreases or destabilizes
the productivity.

58. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:2 to 3431, comprising the following:

(i) identifying an isozyme relating to biosynthesis of at least one compound selected from an amino acid, a
nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, based on the nucleotide sequence
information represented by SEQ ID NOS:2 to 3431;

(ii) classifying the isozyme identified in (i) into an isozyme having the same activity;

(iii) mutating all genes encoding the isozyme having the same activity simultaneously; and

(iv) examining productivity by a fermentation method of the compound selected in (i) of the coryneform bacterium
which has been transformed with the gene obtained in (iii).

59. A method for breeding a coryneform bacterium using the nucleotide sequence information represented by SEQ
ID NOS:2 to 3431, comprising the following:

(i) arranging function information of an open reading frame (ORF) represented by SEQ ID NOS:2 to 3431;

(ii) allowing the arranged ORF to correspond to an enzyme on a known biosynthesis or signal transmission
pathway;

(iii) explicating an unknown biosynthesis pathway or signal transmission pathway of a coryneform bacterium
in combination with information relating to a known biosynthesis pathway or signal transmission pathway of a
coryneform bacterium;

(iv) comparing the pathway explicated in (iii) with a biosynthesis pathway of a target useful product; and

(v) transgenetically varying a coryneform bacterium based on the nucleotide sequence information to either
strengthen a pathway which is judged to be important in the biosynthesis of the target useful product in (iv) or
weaken a pathway which is judged not to be important in the biosynthesis of the target useful product in (iv).

60. A coryneform bacterium, bred by the method of any one of claims 52 to 59.

61. The coryneform bacterium according to claim 60, which is a microorganism belonging to the genus Corynebacte- 
rium, the genus Brevibacterium, or the genus Microbacterium. 

62. The coryneform bacterium according to claim 61, wherein the microorganism belonging to the genus
Corynebacterium is selected from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum,
Corynebacterium acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium
lilium, Corynebacterium melassecola, Corynebacterium thermoaminogenes, and Corynebacterium
ammoniagenes.

63. A method for producing at least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide,
an organic acid and an analogue thereof, comprising:

culturing a coryneform bacterium of any one of claims 60 to 62 in a medium to produce and accumulate at
least one compound selected from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid,
and analogues thereof; and
recovering the compound from the culture.

64. The method according to claim 63, wherein the compound is L-lysine.

65. A method for identifying a protein relating to useful mutation based on proteome analysis, comprising the following:

(i) preparing
a protein derived from a bacterium of a production strain of a coryneform bacterium which has been
subjected to mutation breeding by a fermentation process so as to produce at least one compound selected
from an amino acid, a nucleic acid, a vitamin, a saccharide, an organic acid, and analogues thereof, and
a protein derived from a bacterium of a parent strain of the production strain;

(ii) separating the proteins prepared in (i) by two dimensional electrophoresis;

(iii) detecting the separated proteins, and comparing an expression amount of the protein derived from the
production strain with that derived from the parent strain;

(iv) treating the protein showing different expression amounts as a result of the comparison with a peptidase
to extract peptide fragments;

(v) analyzing amino acid sequences of the peptide fragments obtained in (iv); and

(vi) comparing the amino acid sequences obtained in (v) with the amino acid sequences represented by SEQ
ID NOS:3502 to 7001 to identify the protein having the amino acid sequences.
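
Step (iii) above, the comparison of expression amounts between the production strain and its parent, can be sketched in Python as a fold-change filter over matched gel spots. The spot intensities, the two-fold threshold and the function name `differential_spots` are hypothetical; how spot volumes are quantified and normalized is not specified here and is left to the user.

```python
def differential_spots(parent, production, fold=2.0):
    """Flag spots whose intensity differs between strains by >= `fold`.

    parent, production : dicts mapping a spot label to a positive intensity
    Returns a dict mapping each flagged spot to its production/parent ratio.
    """
    flagged = {}
    for spot in parent.keys() & production.keys():
        p, q = parent[spot], production[spot]
        if p <= 0 or q <= 0:
            continue  # a ratio is undefined for absent or zero spots
        ratio = q / p
        if ratio >= fold or ratio <= 1.0 / fold:
            flagged[spot] = ratio
    return flagged
```

Spots flagged in this way would then be excised, digested and sequenced as in steps (iv) to (vi).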

66. The method according to claim 65, wherein the coryneform bacterium is a microorganism belonging to the genus
Corynebacterium, the genus Brevibacterium, or the genus Microbacterium.

67. The method according to claim 66, wherein the microorganism belonging to the genus Corynebacterium is selected
from the group consisting of Corynebacterium glutamicum, Corynebacterium acetoacidophilum, Corynebacterium
acetoglutamicum, Corynebacterium callunae, Corynebacterium herculis, Corynebacterium lilium, Corynebacterium
melassecola, Corynebacterium thermoaminogenes, and Corynebacterium ammoniagenes.

68. A biologically pure culture of Corynebacterium glutamicum AHP-3 (FERM BP-7382).